1.
Chemosphere ; 353: 141510, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38401861

ABSTRACT

Biotite, a phyllosilicate mineral, possesses significant potential for cesium (Cs) adsorption owing to its negative surface charge, specific surface area (SSA), and frayed edge sites (FES). Notably, FES are known to play an important role in the adsorption of Cs. The objectives of this study were to investigate the Cs adsorption capacity and behavior of artificially weathered biotite and to identify its mineralogical characteristics for the development of an eco-friendly, geologically based Cs adsorbent. Through various analyses, it was confirmed that the FES of biotite were mainly formed by mineral structural distortion during artificial weathering. The Cs adsorption capacity improved by approximately 39% (from 20.53 to 28.63 mg g⁻¹) when FES were formed in biotite through artificial weathering using a low-concentration acidic solution mixed with hydrogen peroxide (H2O2). In particular, the Cs selectivity in Cs-containing seawater, which includes high concentrations of cations and organic matter, was significantly enhanced from 203.2 to 1707.6 mL g⁻¹, corresponding to an increase in removal efficiency from 49.5 to 89.2%. These results indicate that the FES of artificially weathered biotite play an essential role in Cs adsorption. Therefore, this simple and economical weathering method, which uses a low-concentration acidic solution mixed with H2O2, can be applied to natural minerals for use as Cs adsorbents.
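
The abstract reports a selectivity value in mL per g alongside a removal efficiency in percent, but gives no formulas. The sketch below assumes the usual batch-adsorption definitions, which are not confirmed by the source: removal is (C0 - Ce)/C0 and the distribution coefficient is (C0 - Ce)/Ce multiplied by the solution volume per adsorbent mass. All function and parameter names are hypothetical.

```python
def removal_efficiency(c0, ce):
    """Percent of Cs removed; c0 = initial, ce = equilibrium concentration."""
    return 100.0 * (c0 - ce) / c0


def distribution_coefficient(c0, ce, volume_ml, mass_g):
    """Kd in mL/g under the assumed batch definition (C0 - Ce)/Ce * V/m."""
    return (c0 - ce) / ce * volume_ml / mass_g


def implied_volume_to_mass(kd_ml_per_g, removal_pct):
    """Back-calculate V/m (mL/g) from a Kd and a removal efficiency.

    Follows from the two assumed definitions above:
    Kd = r / (1 - r) * V/m, with r the removal fraction.
    """
    r = removal_pct / 100.0
    return kd_ml_per_g * (1.0 - r) / r


def relative_increase(before, after):
    """Percent change from `before` to `after`."""
    return 100.0 * (after - before) / before


# Values quoted in the abstract.
capacity_gain = relative_increase(20.53, 28.63)
ratio_before = implied_volume_to_mass(203.2, 49.5)
ratio_after = implied_volume_to_mass(1707.6, 89.2)
```

Under these assumed definitions, the two reported pairs back-calculate to a similar volume-to-mass ratio (roughly 207 mL per g). That ratio is an inference from the arithmetic, not a value stated in the abstract.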


Subject(s)
Aluminum Silicates; Cesium; Hydrogen Peroxide; Cesium/chemistry; Minerals/chemistry; Ferrous Compounds/chemistry; Adsorption
2.
ACS Appl Mater Interfaces ; 15(28): 33751-33762, 2023 Jul 19.
Article in English | MEDLINE | ID: mdl-37404033

ABSTRACT

Solution-processed metal-oxide thin-film transistors (TFTs) with different metal compositions are investigated in ex situ and in situ radiation hardness experiments against ionizing radiation exposure. The synergetic combination of the structural plasticity of Zn, the defect tolerance of Sn, and the high electron mobility of In identifies amorphous zinc-indium-tin oxide (Zn-In-Sn-O or ZITO) as an optimal radiation-resistant channel layer for TFTs. ZITO with an elemental blending ratio of 4:1:1 for Zn/In/Sn exhibits superior ex situ radiation resistance compared to In-Ga-Zn-O, Ga-Sn-O, Ga-In-Sn-O, and Ga-Sn-Zn-O. Based on the in situ irradiation results, where a negative threshold voltage shift, a mobility increase, and increases in both off current and leakage current are observed, three factors are proposed for the degradation mechanisms: (i) increase of channel conductivity, (ii) interface-trapped and dielectric-trapped charge buildup, and (iii) trap-assisted tunneling in the dielectric. Finally, in situ radiation-hard oxide-based TFTs are demonstrated by employing a radiation-resistant ZITO channel, a thin dielectric (50 nm SiO2), and a passivation layer (PCBM for ambient exposure), which exhibit excellent stability with an electron mobility of ∼10 cm²/(V s) and a ΔVth of <3 V under real-time (15 kGy/h) gamma-ray irradiation in an ambient atmosphere.

3.
IEEE Trans Nanobioscience ; 22(4): 771-779, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37163410

ABSTRACT

Cancer metastasis is a complex process which involves the spread of tumor cells from the primary site to other parts of the body. Metastasis is the major cause of cancer mortality, accounting for about 90% of cancer deaths. Metastasis is primarily diagnosed by clinical examinations and imaging techniques, but such a diagnosis is made after metastasis has occurred. Prediction or early detection of metastasis is important for treatment planning since it has an impact on the survival of patients. Recently a few methods have been developed to predict lymph node metastasis, but few methods are available for predicting distant metastasis. Motivated by a gene regulation mechanism involving miRNAs, we have developed a new method for predicting both lymph node metastasis and distant metastasis. We have derived differential correlations of miRNAs and their target RNAs in cancer, and built prediction models using the differential correlations. Testing the method on several types of cancer showed that differential correlations of miRNAs and target RNAs are much more powerful and stable than expressions of known metastasis predictive genes in predicting distant metastasis as well as lymph node metastasis. The method developed in this study will be useful in predicting metastasis and thereby in determining treatment options for cancer patients.

4.
Int J Mol Sci ; 24(5)2023 Mar 06.
Article in English | MEDLINE | ID: mdl-36902481

ABSTRACT

Despite remarkable progress in cancer research and treatment over the past decades, cancer ranks as a leading cause of death worldwide. In particular, metastasis is the major cause of cancer deaths. After an extensive analysis of miRNAs and RNAs in tumor tissue samples, we derived miRNA-RNA pairs with substantially different correlations from those in normal tissue samples. Using the differential miRNA-RNA correlations, we constructed models for predicting metastasis. A comparison of our model to other models with the same data sets of solid cancer showed that our model is much better than the others in both lymph node metastasis and distant metastasis. The miRNA-RNA correlations were also used in finding prognostic network biomarkers in cancer patients. The results of our study showed that miRNA-RNA correlations and networks consisting of miRNA-RNA pairs were more powerful in predicting prognosis as well as metastasis. Our method and the biomarkers obtained using the method will be useful for predicting metastasis and prognosis, which in turn will help select treatment options for cancer patients and targets of anti-cancer drug discovery.
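
To make the idea of a differential miRNA-RNA correlation concrete, here is a minimal sketch. It assumes Pearson correlation and a fixed difference cutoff; the abstract does not say which correlation measure or selection criterion the authors used, so `min_diff`, the function names, and the toy expression values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def differential_correlations(tumor, normal, pairs, min_diff=0.5):
    """Keep miRNA-RNA pairs whose correlation differs between groups.

    tumor / normal: {name: [expression value per sample]}
    Returns {(mirna, rna): (r_tumor, r_normal)}.
    """
    selected = {}
    for mirna, rna in pairs:
        r_t = pearson(tumor[mirna], tumor[rna])
        r_n = pearson(normal[mirna], normal[rna])
        if abs(r_t - r_n) >= min_diff:
            selected[(mirna, rna)] = (r_t, r_n)
    return selected


# Toy data (hypothetical): GENE1 flips from positive to negative correlation
# with miR-a in tumor samples, GENE2 barely changes.
tumor = {"miR-a": [1, 2, 3, 4], "GENE1": [4, 3, 2, 1], "GENE2": [1, 2, 3, 5]}
normal = {"miR-a": [1, 2, 3, 4], "GENE1": [1, 2, 3, 4], "GENE2": [1, 2, 3, 4]}
pairs = [("miR-a", "GENE1"), ("miR-a", "GENE2")]
result = differential_correlations(tumor, normal, pairs)
```

In this toy input only the miR-a/GENE1 pair is retained. Pairs selected this way would then serve as features for a downstream prediction model, which is outside the scope of this sketch.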


Subject(s)
MicroRNAs; Humans; MicroRNAs/genetics; RNA, Messenger/genetics; Lymphatic Metastasis; Biomarkers, Tumor/genetics; Gene Regulatory Networks; Gene Expression Regulation, Neoplastic; Gene Expression Profiling
5.
IEEE/ACM Trans Comput Biol Bioinform ; 20(5): 2671-2680, 2023.
Article in English | MEDLINE | ID: mdl-36227824

ABSTRACT

Inspired by a newly discovered gene regulation mechanism known as competing endogenous RNA (ceRNA) interactions, several computational methods have been proposed to generate ceRNA networks. However, most of these methods have focused on deriving restricted types of ceRNA interactions such as lncRNA-miRNA-mRNA interactions. Competition for miRNA binding occurs not only between lncRNAs and mRNAs but also between lncRNAs or between mRNAs. Furthermore, a large number of pseudogenes also act as ceRNAs, thereby regulating other genes. In this study, we developed a general method for constructing integrative networks of all possible interactions of ceRNAs in renal cell carcinoma (RCC). From the ceRNA networks we derived potential prognostic biomarkers, each of which is a triplet of two ceRNAs and a miRNA (i.e., ceRNA-miRNA-ceRNA). Interestingly, some prognostic ceRNA triplets do not include mRNA at all and consist of two non-coding RNAs and a miRNA, a combination that has rarely been reported so far. Comparison of the prognostic ceRNA triplets to known prognostic genes in RCC showed that the triplets have better predictive power for survival rates than the known prognostic genes. Our approach will help us construct integrative networks of ceRNAs of all types and find new potential prognostic biomarkers in cancer.
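
The ceRNA-miRNA-ceRNA triplet structure can be illustrated with a small sketch. It only enumerates candidate triplets from a table of miRNA targets, treating every RNA type (mRNA, lncRNA, pseudogene) alike; the expression-based filtering and survival analysis that the authors apply are not described in the abstract and are not modeled here. The function name and the identifiers in the toy input are hypothetical.

```python
from itertools import combinations


def cerna_triplets(mirna_targets):
    """Enumerate candidate (ceRNA, miRNA, ceRNA) triplets.

    mirna_targets: {miRNA id: iterable of RNA ids that share that miRNA}.
    Any two RNAs bound by the same miRNA form one candidate triplet,
    regardless of RNA type.
    """
    triplets = []
    for mirna in sorted(mirna_targets):
        rnas = sorted(set(mirna_targets[mirna]))
        for rna_a, rna_b in combinations(rnas, 2):
            triplets.append((rna_a, mirna, rna_b))
    return triplets


# Toy input (hypothetical identifiers).
targets = {
    "miR-x": ["lncRNA1", "mRNA1", "pseudogene1"],
    "miR-y": ["mRNA1"],  # a single target cannot form a triplet
}
triplets = cerna_triplets(targets)
```

Here miR-x yields three candidate triplets, including one (lncRNA1, miR-x, pseudogene1) that contains no mRNA, which is the kind of triplet the abstract highlights.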

6.
Biomolecules ; 12(7)2022 07 13.
Article in English | MEDLINE | ID: mdl-35883535

ABSTRACT

Breast cancer is one of the most prevalent cancers in females, with more than 450,000 deaths each year worldwide. Among the subtypes of breast cancer, basal-like breast cancer, also known as triple-negative breast cancer, shows the lowest survival rate and does not have effective treatments yet. Somatic mutations in the TP53 gene frequently occur across all breast cancer subtypes, but comparative analysis of gene correlations with respect to mutations in TP53 has not been done so far. The primary goal of this study is to identify gene correlations in two groups of breast cancer patients and to derive potential prognostic gene pairs for breast cancer. We partitioned breast cancer patients into two groups: one group with a mutated TP53 gene (mTP53) and the other with a wild-type TP53 gene (wtTP53). For every gene pair, we computed the hazard ratio using the Cox proportional hazard model and constructed gene correlation networks (GCNs) enriched with prognostic information. Our GCN is more informative than typical GCNs in the sense that it indicates the type of correlation between genes, the concordance index, and the prognostic type of a gene. Comparative analysis of correlation patterns and survival time of the two groups revealed several interesting findings. First, we found several new gene pairs with opposite correlations in the two GCNs and the difference in their correlation patterns was the most prominent in the basal-like subtype of breast cancer. Second, we obtained potential prognostic genes for breast cancer patients with a wild-type TP53 gene. From a comparative analysis of GCNs of mTP53 and wtTP53, we found several gene pairs that show significantly different correlation patterns in the basal-like breast cancer subtype and obtained prognostic genes for patients with a wild-type TP53 gene. 
The GCNs and prognostic genes identified in this study will be informative for the prognosis of survival and for selecting a drug target for breast cancer, in particular for basal-like breast cancer. To the best of our knowledge, this is the first attempt to construct GCNs for breast cancer patients with or without mutations in the TP53 gene and to find prognostic genes accordingly.


Subject(s)
Breast Neoplasms; Triple Negative Breast Neoplasms; Breast Neoplasms/genetics; Breast Neoplasms/therapy; Female; Genes, p53; Humans; Mutation; Proportional Hazards Models; Triple Negative Breast Neoplasms/genetics; Tumor Suppressor Protein p53/genetics
7.
IEEE/ACM Trans Comput Biol Bioinform ; 19(3): 1267-1276, 2022.
Article in English | MEDLINE | ID: mdl-32809942

ABSTRACT

Many of the known prognostic gene signatures for cancer are individual genes or combinations of genes, found by the analysis of microarray data. However, many of the known cancer signatures are less predictive than random gene expression signatures, and such random signatures are significantly associated with proliferation genes. With the availability of RNA-seq gene expression data for thousands of human cancer patients, we have analyzed RNA-seq and clinical data of cancer patients and constructed gene correlation networks specific to individual cancer patients. From the patient-specific gene correlation networks, we derived prognostic gene pairs for three types of cancer. In this paper, we propose a new method for inferring prognostic gene pairs from patient-specific gene correlation networks. Our method differs from previous ones in that (1) it focuses on finding prognostic gene pairs rather than prognostic genes, (2) it can identify prognostic gene pairs from RNA-seq data even when no significant prognostic genes exist, and (3) prognostic gene pairs can serve as robust prognostic biomarkers in the sense that most prognostic gene pairs show little association with proliferation genes, the major boosting factor of the predictive power of random gene signatures. Evaluation of our method with extensive data of three types of cancer (liver cancer, pancreatic cancer, and stomach cancer) showed that our approach is general and that gene pairs can serve as more reliable prognostic signatures for cancer than genes. Analysis of patient-specific gene networks suggests that the prognosis of individual cancer patients is affected by the existence of prognostic gene pairs in the patient-specific network and by the size of the patient-specific network. Although preliminary, our approach will be useful for finding gene pairs to predict survival time of patients and to tailor treatments to individual characteristics.
The program for dynamically constructing patient-specific gene networks and for finding prognostic gene pairs is available at http://bclab.inha.ac.kr/LPS.


Subject(s)
Gene Regulatory Networks; Liver Neoplasms; Biomarkers, Tumor/genetics; Gene Expression Profiling; Gene Expression Regulation, Neoplastic/genetics; Gene Regulatory Networks/genetics; Humans; Liver Neoplasms/genetics; Prognosis; RNA-Seq; Transcriptome
8.
Comput Methods Programs Biomed ; 212: 106465, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34715518

ABSTRACT

BACKGROUND AND OBJECTIVE: Most known prognostic gene signatures for cancer are either individual genes or combinations of genes. Neither individual genes nor combinations of genes provide information on gene-gene relations, and they often have less prognostic significance than random genes associated with cell proliferation. Several methods for generating sample-specific gene networks have been proposed, but programs implementing the methods are not publicly available. METHODS: We have developed a method that builds gene correlation networks specific to individual cancer patients and derives prognostic gene correlations from the networks. A gene correlation network specific to a patient is constructed by identifying gene-gene relations that are significantly different from normal samples. Prognostic gene pairs are obtained by carrying out the Cox proportional hazards regression and the log-rank test for every gene pair. RESULTS: We built a web application server called GeneCoNet with thousands of tumor samples in TCGA. Given a tumor sample ID of TCGA, GeneCoNet dynamically constructs a gene correlation network specific to the sample as output. As an additional output, it provides information on prognostic gene correlations in the network. GeneCoNet found several prognostic gene correlations for six types of cancer, but there were no prognostic gene pairs common to multiple cancer types. CONCLUSION: Extensive analysis of patient-specific gene correlation networks suggests that patients with a larger subnetwork of prognostic gene pairs have shorter survival time than the others and that patients with a subnetwork that contains more genes participating in prognostic gene pairs have shorter survival time than the others. GeneCoNet can be used as a valuable resource for generating gene correlation networks specific to individual patients and for identifying prognostic gene correlations. It is freely accessible at http://geneconet.inha.ac.kr.


Subject(s)
Gene Regulatory Networks; Neoplasms; Gene Expression Profiling; Humans; Neoplasms/genetics; Prognosis
9.
Comput Biol Chem ; 84: 107171, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31931434

ABSTRACT

Recent advances in high-throughput experimental technologies have generated a huge amount of data on interactions between proteins and nucleic acids. Motivated by the big experimental data, several computational methods have been developed either to predict binding sites in a sequence or to determine if an interaction exists between protein and nucleic acid sequences. However, most of the methods cannot be used to discover new nucleic acid sequences that bind to a target protein because they are classifiers rather than generators. In this paper we propose a generative model for constructing protein-binding RNA sequences and motifs using a long short-term memory (LSTM) neural network. Testing the model for several target proteins showed that RNA sequences generated by the model have high binding affinity and specificity for their target proteins and that the protein-binding motifs derived from the generated RNA sequences are comparable to the motifs from experimentally validated protein-binding RNA sequences. The results are promising and we believe this approach will help design more efficient in vitro or in vivo experiments by suggesting potential RNA aptamers for a target protein.


Subject(s)
Models, Biological; RNA-Binding Proteins/metabolism; RNA/metabolism; Binding Sites; Computational Biology/methods; Nucleotide Motifs
10.
BMC Genomics ; 20(Suppl 13): 967, 2019 Dec 27.
Article in English | MEDLINE | ID: mdl-31881936

ABSTRACT

BACKGROUND: Interactions between protein and nucleic acid molecules are essential to a variety of cellular processes. A large amount of interaction data generated by high-throughput technologies has triggered the development of several computational methods either to predict binding sites in a sequence or to determine whether a pair of sequences interacts or not. Most of these methods treat the problem of the interaction of nucleic acids with proteins as a classification problem rather than a generation problem. RESULTS: We developed a generative model for constructing single-stranded nucleic acids binding to a target protein using a long short-term memory (LSTM) neural network. Experimental results of the generative model are promising in the sense that DNA and RNA sequences generated by the model for several target proteins show high specificity and that motifs present in the generated sequences are similar to known protein-binding motifs. CONCLUSIONS: Although these are preliminary results of our ongoing research, our approach can be used to generate nucleic acid sequences binding to a target protein. In particular, it will help design efficient in vitro experiments by constructing an initial pool of potential aptamers that bind to a target protein with high affinity and specificity.


Subject(s)
DNA/metabolism; Neural Networks, Computer; Proteins/metabolism; Algorithms; Aptamers, Nucleotide/chemistry; Aptamers, Nucleotide/metabolism; Base Sequence; Humans; Nucleic Acid Conformation; Protein Binding; Proteins/chemistry; Transcription Factors/metabolism
11.
BMC Med Genomics ; 12(Suppl 8): 179, 2019 12 20.
Article in English | MEDLINE | ID: mdl-31856825

ABSTRACT

BACKGROUND: Molecular characterization of individual cancer patients is important because cancer is a complex and heterogeneous disease with many possible genetic and environmental causes. Many studies have been conducted to identify diagnostic or prognostic gene signatures for cancer from gene expression profiles. However, some gene signatures may fail to serve as diagnostic or prognostic biomarkers and gene signatures may not be found in gene expression profiles. METHODS: In this study, we developed a general method for constructing patient-specific gene correlation networks and for identifying prognostic gene pairs from the networks. A patient-specific gene correlation network was constructed by comparing a reference gene correlation network from normal samples to a network perturbed by a single patient sample. Our method differs from previous ones in that (1) it focuses on finding prognostic gene pairs rather than prognostic genes and (2) it can identify prognostic gene pairs from gene expression profiles even when no significant prognostic genes exist. RESULTS: Evaluation of our method with extensive data sets of three cancer types (breast invasive carcinoma, colon adenocarcinoma, and lung adenocarcinoma) showed that our approach is general and that gene pairs can serve as more reliable prognostic signatures for cancer than genes. CONCLUSIONS: Our study revealed that the prognosis of individual cancer patients is associated with the existence of prognostic gene pairs in the patient-specific network and the size of a subnetwork of the prognostic gene pairs in the patient-specific network. Although preliminary, our approach will be useful for finding gene pairs to predict survival time of patients and to tailor treatments to individual characteristics. The program for dynamically constructing patient-specific gene networks and for finding prognostic gene pairs is available at http://bclab.inha.ac.kr/pancancer.
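
The construction described in METHODS (a reference network from normal samples compared with the same network after adding one patient sample) can be sketched as follows. This is an illustration under stated assumptions, not the authors' program: it uses Pearson correlation and a plain cutoff `min_delta` on the change in correlation, whereas the abstract does not specify the correlation measure or the significance test. All names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def patient_specific_edges(reference, patient, min_delta=0.3):
    """Edges whose correlation shifts when one patient sample is added.

    reference: {gene: [expression across normal samples]}
    patient:   {gene: expression in the single patient sample}
    Returns {(gene1, gene2): change in correlation}.
    """
    edges = {}
    for g1, g2 in combinations(sorted(reference), 2):
        r_ref = pearson(reference[g1], reference[g2])
        # Perturbed network: the same normal samples plus the patient sample.
        r_pert = pearson(reference[g1] + [patient[g1]],
                         reference[g2] + [patient[g2]])
        delta = r_pert - r_ref
        if abs(delta) >= min_delta:
            edges[(g1, g2)] = delta
    return edges


# Toy data (hypothetical): in normal samples A and B rise together and C falls.
reference = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [1.1, 2.0, 2.9, 4.2, 5.0],
    "C": [5.0, 3.9, 3.1, 2.0, 1.0],
}
# The patient follows the normal A-C trend but has an unexpectedly low B.
patient = {"A": 6.0, "B": 0.5, "C": 0.2}
edges = patient_specific_edges(reference, patient)
```

In the toy input only the pairs involving gene B survive the cutoff, because the patient's B value breaks the relationships seen in the reference samples while the A-C relationship is preserved.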


Subject(s)
Computational Biology/methods; Gene Regulatory Networks; Neoplasms/diagnosis; Neoplasms/genetics; Aging/genetics; Female; Humans; Male; Prognosis; Sex Characteristics; Survival Analysis
12.
BMC Genomics ; 19(Suppl 6): 568, 2018 Aug 13.
Article in English | MEDLINE | ID: mdl-30367586

ABSTRACT

BACKGROUND: Viral infection involves a large number of protein-protein interactions (PPIs) between a virus and its host. These interactions range from the initial binding of viral coat proteins to host membrane receptors to the hijacking of the host transcription machinery by viral proteins. Therefore, identifying PPIs between a virus and its host helps in understanding the mechanism of viral infections and in designing antiviral drugs. Many computational methods have been developed to predict PPIs, but most of them are intended for PPIs within a species rather than PPIs across different species, such as PPIs between virus and host. RESULTS: In this study, we developed a prediction model of virus-host PPIs, which is applicable to new viruses and hosts. We tested the prediction model on independent datasets of virus-host PPIs, which were not used in training the model. Despite a low sequence similarity between proteins in training datasets and target proteins in test datasets, the prediction model showed a high performance comparable to the best performance of other methods for single virus-host PPIs. CONCLUSIONS: Our method will be particularly useful for finding PPIs between hosts and new viruses for which little information is available. The program and support data are available at http://bclab.inha.ac.kr/VirusHostPPI .


Subject(s)
Protein Interaction Mapping/methods; Viral Proteins/metabolism; Animals; Host Microbial Interactions; Humans; Sequence Analysis, Protein; Viral Proteins/chemistry
13.
J Healthc Eng ; 2018: 1391265, 2018.
Article in English | MEDLINE | ID: mdl-29854357

ABSTRACT

Previous methods for predicting protein-protein interactions (PPIs) were mainly focused on PPIs within a single species, but PPIs across different species have recently emerged as an important issue in some areas such as viral infection. The primary focus of this study is to predict PPIs between a virus and its targeted host, which are involved in viral infection. We developed a general method that predicts interactions between virus and host proteins using the repeat patterns and composition of amino acids. In independent testing of the method with PPIs of new viruses and hosts, it showed a high performance comparable to the best performance of other methods for single virus-host PPIs. In a comparison of our method with others using the same datasets, our method outperformed the others. The repeat patterns and composition of amino acids are simple yet powerful features for predicting virus-host PPIs. The method developed in this study will help in finding new virus-host PPIs for which little information is available.


Subject(s)
Host-Pathogen Interactions/physiology; Protein Interaction Mapping/methods; Viral Proteins; Amino Acids/chemistry; Amino Acids/metabolism; Animals; Cattle; Humans; Mice; Viral Proteins/chemistry; Viral Proteins/metabolism; Virus Diseases/physiopathology; Virus Diseases/virology; Viruses/chemistry; Viruses/pathogenicity
14.
Nucleic Acids Res ; 45(11): 6894-6910, 2017 Jun 20.
Article in English | MEDLINE | ID: mdl-28472401

ABSTRACT

RNA-binding proteins (RBPs) are involved in mRNA splicing, maturation, transport, translation, storage and turnover. Here, we identified ACOT7 mRNA as a novel target of human WIG1. ACOT7 mRNA decay was triggered by the microRNA miR-9 in a WIG1-dependent manner via classic recruitment of Argonaute 2 (AGO2). Interestingly, AGO2 was also recruited to ACOT7 mRNA in a WIG1-dependent manner in the absence of miR-9, which indicates an alternative model whereby WIG1 controls AGO2-mediated gene silencing. The WIG1-AGO2 complex attenuated translation initiation via an interaction with translation initiation factor 5B (eIF5B). These results were confirmed using a WIG1 tethering system based on the MS2 bacteriophage coat protein and a reporter construct containing an MS2-binding site, and by immunoprecipitation of WIG1 and detection of WIG1-associated proteins using liquid chromatography-tandem mass spectrometry. We also identified WIG1-binding motifs using photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation analyses. Altogether, our data indicate that WIG1 governs the miRNA-dependent and the miRNA-independent recruitment of AGO2 to lower the stability of and suppress the translation of ACOT7 mRNA.


Subject(s)
Argonaute Proteins/physiology; Carrier Proteins/physiology; MicroRNAs/physiology; Nuclear Proteins/physiology; RNA Interference; RNA, Messenger/metabolism; 3' Untranslated Regions; Base Sequence; Binding Sites; Eukaryotic Initiation Factors/metabolism; HCT116 Cells; HEK293 Cells; Humans; Inverted Repeat Sequences; MCF-7 Cells; Protein Binding; Protein Biosynthesis; Protein Domains; RNA Stability; RNA, Messenger/genetics; RNA-Binding Proteins
15.
BMC Syst Biol ; 11(Suppl 2): 16, 2017 03 14.
Article in English | MEDLINE | ID: mdl-28361677

ABSTRACT

BACKGROUND: Motivated by the increased amount of data on protein-RNA interactions and the availability of complete genome sequences of several organisms, many computational methods have been proposed to predict binding sites in protein-RNA interactions. However, most computational methods are limited to finding RNA-binding sites in proteins instead of protein-binding sites in RNAs. Predicting protein-binding sites in RNA is more challenging than predicting RNA-binding sites in proteins. Recent computational methods for finding protein-binding sites in RNAs have several drawbacks for practical use. RESULTS: We developed a new support vector machine (SVM) model for predicting protein-binding regions in mRNA sequences. The model uses sequence profiles constructed from log-odds scores of mono- and di-nucleotides and nucleotide compositions. The model was evaluated by standard 10-fold cross validation, leave-one-protein-out (LOPO) cross validation and independent testing. Since actual mRNA sequences have more non-binding regions than protein-binding regions, we tested the model on several datasets with different ratios of protein-binding regions to non-binding regions. The best performance of the model was obtained in a balanced dataset of positive and negative instances. 10-fold cross validation with a balanced dataset achieved a sensitivity of 91.6%, a specificity of 92.4%, an accuracy of 92.0%, a positive predictive value (PPV) of 91.7%, a negative predictive value (NPV) of 92.3% and a Matthews correlation coefficient (MCC) of 0.840. LOPO cross validation showed a lower performance than the 10-fold cross validation, but the performance remained high (87.6% accuracy and 0.752 MCC). In testing the model on independent datasets, it achieved an accuracy of 82.2% and an MCC of 0.656. Testing of our model and other state-of-the-art methods on the same dataset showed that our model is better than the others.
CONCLUSIONS: Sequence profiles of log-odds scores of mono- and di-nucleotides were much more powerful features than nucleotide compositions in finding protein-binding regions in RNA sequences. However, a slight performance gain was obtained when using the sequence profiles along with nucleotide compositions. These are preliminary results of ongoing research, but they demonstrate the potential of our approach as a powerful predictor of protein-binding regions in RNA. The program and supporting data are available at http://bclab.inha.ac.kr/RBPbinding .
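
For readers unfamiliar with the performance measures quoted above, the following sketch computes them from a binary confusion matrix using their standard definitions. The counts in the example are made up for illustration and are not taken from the study; the function name is hypothetical.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Standard performance measures from confusion-matrix counts.

    tp/fn: binding regions predicted correctly / missed
    tn/fp: non-binding regions predicted correctly / wrongly flagged
    """
    total = tp + fp + tn + fn
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


# Illustrative balanced dataset: 100 positives and 100 negatives.
m = binary_metrics(tp=90, fp=10, tn=90, fn=10)
```

On a balanced dataset like this one, accuracy is the mean of sensitivity and specificity. On imbalanced data, PPV and MCC are more sensitive to the class ratio, which is consistent with the abstract's note that results varied with the ratio of binding to non-binding regions.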


Subject(s)
Computational Biology/methods; Nucleotides/metabolism; RNA-Binding Proteins/metabolism; RNA/chemistry; RNA/metabolism; Humans; Support Vector Machine
16.
Data Brief ; 10: 561-563, 2017 Feb.
Article in English | MEDLINE | ID: mdl-28070546

ABSTRACT

Despite the increasing number of protein-RNA complexes in structure databases, few data resources have been made available which can be readily used in developing or testing a method for predicting either protein-binding sites in RNA sequences or RNA-binding sites in protein sequences. The problem of predicting protein-binding sites in RNA has received much less attention than the problem of predicting RNA-binding sites in protein. The data presented in this paper are related to the article entitled "PRIdictor: Protein-RNA Interaction predictor" (Tuvshinjargal et al. 2016) [1]. PRIdictor can predict protein-binding sites in RNA as well as RNA-binding sites in protein at the nucleotide- and residue-levels. This paper presents four datasets that were used to test four prediction models of PRIdictor: (1) model RP for predicting protein-binding sites in RNA from protein and RNA sequences, (2) model RaP for predicting protein-binding sites in RNA from RNA sequence alone, (3) model PR for predicting RNA-binding sites in protein from protein and RNA sequences, and (4) model PaR for predicting RNA-binding sites in protein from protein sequence alone. The datasets supplied in this article can be used as a valuable resource to evaluate and compare different methods for predicting protein-RNA binding sites.

17.
Article in English | MEDLINE | ID: mdl-29990126

ABSTRACT

A transcription factor (TF) is a protein that regulates gene expression by binding to specific DNA sequences. Despite recent advances in experimental techniques for identifying transcription factor binding sites (TFBS) in DNA sequences, a large number of TFBS remain to be unveiled in many species. Several computational methods developed for predicting TFBS in DNA are tissue- or species-specific, so they cannot be used without prior knowledge of the tissue or species. Some computational methods are applicable to finding TFBS in short DNA sequences only. In this paper we propose a new learning method for predicting TFBS in DNA of any length using the composition, transition and distribution of nucleotides and amino acids in DNA and TF sequences. In independent testing of the method on datasets that were not used in training the method, its accuracy and MCC were as high as 81.84% and 0.634, respectively. The proposed method can be a useful aid for selecting potential TFBS in a large amount of DNA sequences before conducting biochemical experiments to empirically determine TFBS. The program and data sets are available at http://bclab.inha.ac.kr/TFbinding.

18.
Biosystems ; 139: 17-22, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26607710

ABSTRACT

Several computational methods have been developed to predict RNA-binding sites in protein, but its inverse problem (i.e., predicting protein-binding sites in RNA) has received much less attention. Furthermore, most methods that predict RNA-binding sites in protein do not consider interaction partners of a protein. This paper presents a web server called PRIdictor (Protein-RNA Interaction predictor) which predicts mutual binding sites in RNA and protein at the nucleotide- and residue-level resolutions from their sequences. PRIdictor can be used as a web-based application or web service at http://bclab.inha.ac.kr/pridictor.


Subject(s)
Amino Acids/metabolism; Proteins/metabolism; RNA/metabolism; Ribonucleotides/metabolism; Support Vector Machine; Binding Sites; Computational Biology
19.
Comput Methods Programs Biomed ; 120(1): 3-15, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25907142

ABSTRACT

In recent years, several computational methods have been developed to predict RNA-binding sites in proteins. Most of these methods do not consider the interacting partners of a protein, so they predict the same RNA-binding sites for a given protein sequence even if the protein binds to different RNAs. Unlike the problem of predicting RNA-binding sites in proteins, the problem of predicting protein-binding sites in RNA has received little attention, mainly because it is much more difficult and shows lower accuracy on average. In our previous study, we developed a method that predicts protein-binding nucleotides from an RNA sequence. To improve the prediction accuracy and usefulness of the previous method, we developed a new method that uses both RNA and protein sequence data. In this study, we identified effective features of RNA and protein molecules and developed a new support vector machine (SVM) model to predict protein-binding nucleotides from RNA and protein sequence data. The new model that used both protein and RNA sequence data achieved a sensitivity of 86.5%, a specificity of 86.2%, a positive predictive value (PPV) of 72.6%, a negative predictive value (NPV) of 93.8% and a Matthews correlation coefficient (MCC) of 0.69 in 10-fold cross-validation; it achieved a sensitivity of 58.8%, a specificity of 87.4%, a PPV of 65.1%, an NPV of 84.2% and an MCC of 0.48 in independent testing. For comparison, we built another prediction model that used RNA sequence data alone and ran it on the same dataset. In 10-fold cross-validation it achieved a sensitivity of 85.7%, a specificity of 80.5%, a PPV of 67.7%, an NPV of 92.2% and an MCC of 0.63; in independent testing it achieved a sensitivity of 67.7%, a specificity of 78.8%, a PPV of 57.6%, an NPV of 85.2% and an MCC of 0.45. In both cross-validation and independent testing, the new model that used both RNA and protein sequences performed better than the model that used RNA sequence data alone on most performance measures. To the best of our knowledge, this is the first sequence-based prediction of protein-binding nucleotides in RNA that considers the binding partner of the RNA. The new model will provide valuable information for designing biochemical experiments to find putative protein-binding sites in RNA of unknown structure.
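
The performance measures quoted in this abstract (sensitivity, specificity, PPV, NPV and MCC) all follow from the four counts of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the paper.

```python
from math import sqrt


def _ratio(num, den):
    """Safe division: return 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts.

    tp/fn: binding nucleotides predicted as binding / non-binding
    tn/fp: non-binding nucleotides predicted as non-binding / binding
    """
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "ppv": _ratio(tp, tp + fp),          # positive predictive value
        "npv": _ratio(tn, tn + fn),          # negative predictive value
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```

Because binding nucleotides are typically the minority class, MCC is a useful single-number summary: unlike accuracy, it stays low when a model does well on only one of the two classes.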


Subject(s)
Nucleotides/chemistry , RNA/chemistry , Algorithms , Base Sequence , Binding Sites , Computational Biology , X-Ray Crystallography , Protein Databases , Molecular Sequence Data , Predictive Value of Tests , Protein Binding , Protein Conformation , ROC Curve , Reproducibility of Results , Sensitivity and Specificity , RNA Sequence Analysis , Support Vector Machine
20.
BMC Genomics ; 16 Suppl 3: S6, 2015.
Article in English | MEDLINE | ID: mdl-25708089

ABSTRACT

BACKGROUND: Interactions between DNA and proteins are essential to many biological processes such as transcriptional regulation and DNA replication. With the increased availability of structures of protein-DNA complexes, several computational studies have been conducted to predict DNA binding sites in proteins. However, few attempts have been made to predict protein binding sites in DNA. RESULTS: From an extensive analysis of protein-DNA complexes, we identified powerful features of DNA and protein sequences which can be used in predicting protein binding sites in DNA sequences. We developed two support vector machine (SVM) models that predict protein binding nucleotides from DNA and/or protein sequences. One SVM model that used DNA sequence data alone achieved a sensitivity of 73.4%, a specificity of 64.8%, an accuracy of 68.9% and a correlation coefficient of 0.382 with a test dataset that was not used in training. Another SVM model that used both DNA and protein sequences achieved a sensitivity of 67.6%, a specificity of 74.3%, an accuracy of 71.4% and a correlation coefficient of 0.418. CONCLUSIONS: Predicting binding sites in double-stranded DNAs is a more difficult task than predicting binding sites in single-stranded molecules. Our study showed that protein binding sites in double-stranded DNA molecules can be predicted with an accuracy comparable to that for single-stranded molecules. Our study also demonstrated that using both DNA and protein sequences resulted in better prediction performance than using DNA sequence data alone. The SVM models and datasets constructed in this study are available at http://bclab.inha.ac.kr/pnimodeler.


Subject(s)
Nucleotides/metabolism , Nucleic Acid Regulatory Sequences , Software , Computational Biology , Nucleic Acid Databases , Protein Databases , Protein Binding