Results 1 - 20 of 60
1.
Comput Biol Med ; 167: 107586, 2023 12.
Article in English | MEDLINE | ID: mdl-37907029

ABSTRACT

The associations between cancer and bacteria/fungi have been extensively studied, but the implications of cancer-associated viruses have not been thoroughly examined. In this study, we comprehensively characterized the cancer virome of tissue samples across 31 cancer types, as well as blood samples from 23 cancer types. Our findings demonstrated the presence of viral DNA at low abundances in both tissue and blood across major human cancers, with significant differences in viral community composition observed among various cancer types. Furthermore, Cox regression analyses conducted on four cancers, including Head and Neck squamous cell carcinoma (HNSC), Kidney renal clear cell carcinoma (KIRC), Stomach adenocarcinoma (STAD), and Uterine Corpus Endometrial Carcinoma (UCEC), revealed strong correlation between viral composition/abundance in tissues and patient survival. Additionally, we identified virus-associated prognostic signatures (VAPS) for these four cancers, and discerned differences in the interplay between VAPS and dominant bacteria in tissues among patients with varying survival risks. Notably, clinically relevant analyses revealed prognostic capacities of the VAPS in these four cancers. Taken together, our study provides novel insights into the role of viruses in tissue in the prognosis of multiple cancers and offers guidance on the use of tissue viruses to stratify prognosis for patients with cancer.


Subjects
Adenocarcinoma , Carcinoma, Renal Cell , Kidney Neoplasms , Stomach Neoplasms , Humans
2.
Math Biosci Eng ; 20(1): 1037-1057, 2023 01.
Article in English | MEDLINE | ID: mdl-36650801

ABSTRACT

DNase I hypersensitive sites (DHSs) are specific genomic regions that are critical for detecting and understanding cis-regulatory elements. Although many methods have been developed to detect DHSs, a large gap remains in practice. We presented a deep learning-based language model for predicting DHSs, named LangMoDHS. LangMoDHS mainly comprised a convolutional neural network (CNN), a bi-directional long short-term memory (Bi-LSTM) network and feed-forward attention. The CNN and the Bi-LSTM were stacked in parallel, which helped accumulate multiple-view representations from primary DNA sequences. We conducted 5-fold cross-validation and independent tests over 14 tissues and 4 developmental stages. The experiments showed that LangMoDHS is competitive with or slightly better than iDHS-Deep, the latest method for predicting DHSs. The experiments also indicated a substantial contribution of the CNN, Bi-LSTM, and attention to DHS prediction. We implemented LangMoDHS as a user-friendly web server, which is accessible at http://www.biolscience.cn/LangMoDHS/. We used indices related to information entropy to explore the sequence motifs of DHSs. The analysis provided some insight into DHSs.


Subjects
Deep Learning , Animals , Mice , Deoxyribonuclease I/genetics , Deoxyribonuclease I/metabolism , Genomics , Regulatory Sequences, Nucleic Acid
3.
Front Microbiol ; 13: 1048478, 2022.
Article in English | MEDLINE | ID: mdl-36560938

ABSTRACT

Transcription factors (TFs) are typical regulators of gene expression and play versatile roles in cellular processes. Because detecting them with physical methods is time-consuming, costly, and labor-intensive, a computational method for detecting TFs is desirable. Here, we presented a capsule network-based method for identifying TFs. This method is an end-to-end deep learning method, consisting mainly of an embedding layer, a bidirectional long short-term memory (LSTM) layer, a capsule network layer, and three fully connected layers. The presented method obtained an accuracy of 0.8820, superior to the state-of-the-art methods. The experiments showed that including the capsule network improved performance and that the capsule network-based representation was superior to the property-based representation for distinguishing between TFs and non-TFs. We also implemented the presented method as a user-friendly web server, which is freely available at http://www.biolscience.cn/Capsule_TF/ for all scientific researchers.

4.
Front Genet ; 13: 1003711, 2022.
Article in English | MEDLINE | ID: mdl-36568390

ABSTRACT

With the development of high-throughput sequencing technology, the scale of single-cell RNA sequencing (scRNA-seq) data has surged. Its data are typically high-dimensional, with high dropout noise and high sparsity. Therefore, gene imputation and cell clustering analysis of scRNA-seq data is increasingly important. Statistical or traditional machine learning methods are inefficient, and improved accuracy is needed. The methods based on deep learning cannot directly process non-Euclidean spatial data, such as cell diagrams. In this study, we developed scGAEGAT, a multi-modal model with graph autoencoders and graph attention networks for scRNA-seq analysis based on graph neural networks. Cosine similarity, median L1 distance, and root-mean-squared error were used to measure the gene imputation performance of different methods for comparison with scGAEGAT. Furthermore, adjusted mutual information, normalized mutual information, completeness score, and Silhouette coefficient score were used to measure the cell clustering performance of different methods for comparison with scGAEGAT. Experimental results demonstrated promising performance of the scGAEGAT model in gene imputation and cell clustering prediction on four scRNA-seq data sets with gold-standard cell labels.
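
Editorial sketch for entry 4: the abstract names cosine similarity, median L1 distance, and root-mean-squared error as the imputation metrics but does not define them. The Python below is one plausible reading, not the authors' implementation. The function names are hypothetical, and treating "median L1 distance" as the median over cells of the per-cell L1 distance is an assumption.

```python
import math
from statistics import median


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length expression vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def rmse(a, b):
    """Root-mean-squared error between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def median_l1_distance(true_rows, imputed_rows):
    """Median over cells (rows) of the L1 distance between true and imputed values.

    Assumption: one row per cell, one column per gene.
    """
    distances = [
        sum(abs(t - p) for t, p in zip(true_row, imputed_row))
        for true_row, imputed_row in zip(true_rows, imputed_rows)
    ]
    return median(distances)
```

Higher cosine similarity and lower L1 distance or RMSE would indicate imputed values closer to the reference expression matrix.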

5.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36242564

ABSTRACT

Breast cancer patients often experience recurrence and metastasis after surgery. Predicting the risk of recurrence and metastasis for a breast cancer patient is essential for the development of precision treatment. In this study, we proposed a novel multi-modal deep learning prediction model by integrating hematoxylin & eosin (H&E)-stained histopathological images, clinical information and gene expression data. Specifically, we segmented tumor regions in H&E images into image blocks (256 × 256 pixels) and encoded each image block into a 1D feature vector using a deep neural network. Then, the attention module scored each area of the H&E-stained images and combined image features with clinical and gene expression data to predict the risk of recurrence and metastasis for each patient. To test the model, we downloaded all 196 breast cancer samples from the Cancer Genome Atlas for which clinical, gene expression and H&E information were simultaneously available. The samples were then divided into training and testing sets at a ratio of 7:3, with the sample distributions kept consistent between the two sets by hierarchical sampling. The multi-modal model achieved an area-under-the-curve value of 0.75 on the testing set, better than models based solely on H&E images, sequencing data or clinical data. This study might have clinical significance in identifying high-risk breast cancer patients, who may benefit from postoperative adjuvant treatment.


Subjects
Breast Neoplasms , Deep Learning , Humans , Female , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Neural Networks, Computer , Eosine Yellowish-(YS) , Gene Expression
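
Editorial sketch for entry 5: the abstract describes a 7:3 split in which the sample distribution is preserved across the training and testing sets ("hierarchical sampling"). Assuming this means ordinary stratified sampling by outcome label, a minimal Python version could look like the following. The function name, the label values, and the fixed seed are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices so each label keeps roughly the same train/test ratio."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train, test = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Usage with made-up labels: 10 "high" risk and 20 "low" risk samples.
labels = ["high"] * 10 + ["low"] * 20
train_idx, test_idx = stratified_split(labels)
```

Splitting within each label group keeps the proportion of high-risk and low-risk patients the same in both sets, which matters when one class is much smaller than the other.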
6.
Sci Rep ; 12(1): 13996, 2022 08 17.
Article in English | MEDLINE | ID: mdl-35978023

ABSTRACT

Deep learning technology is changing the landscape of cybersecurity research, especially the study of large amounts of data. With the rapid growth in the amount of malware, developing an efficient and reliable method for classifying malware has become a research priority. In this paper, a new method, BIR-CNN, is proposed to classify Android malware. It combines a convolutional neural network (CNN) with batch normalization and inception-residual (BIR) network modules, using 347-dimensional network traffic features. The CNN combines inception-residual modules with a convolution layer, which can enhance the learning ability of the model. Batch normalization can speed up the training process and avoid over-fitting of the model. Finally, experiments are conducted on the publicly available network traffic dataset CICAndMal2017, and the results are compared with three traditional machine learning algorithms and a CNN. The accuracy of BIR-CNN is 99.73% in binary classification (2-classifier). Moreover, BIR-CNN can classify malware by category (4-classifier) and malicious family (35-classifier), with classification accuracies of 99.53% and 94.38%, respectively. The experimental results show that the proposed model is an effective method for Android malware classification, especially for category and family classification.


Subjects
Machine Learning , Neural Networks, Computer , Algorithms , Computer Security , Data Collection
7.
Biomolecules ; 12(7)2022 07 17.
Article in English | MEDLINE | ID: mdl-35883552

ABSTRACT

Enhancers are short DNA segments that play a key role in biological processes, such as accelerating transcription of target genes. Because an enhancer can reside anywhere in a genome sequence, it is difficult to identify enhancers precisely. We presented a bi-directional long short-term memory (Bi-LSTM) and attention-based deep learning method (Enhancer-LSTMAtt) for enhancer recognition. Enhancer-LSTMAtt is an end-to-end deep learning model that consists mainly of a deep residual neural network, a Bi-LSTM, and feed-forward attention. We extensively compared Enhancer-LSTMAtt with 19 state-of-the-art methods by 5-fold cross-validation, 10-fold cross-validation and an independent test. Enhancer-LSTMAtt achieved competitive performance, especially in the independent test. We implemented Enhancer-LSTMAtt as a user-friendly web application. Enhancer-LSTMAtt is applicable not only to recognizing enhancers, but also to distinguishing strong enhancers from weak enhancers. Enhancer-LSTMAtt is expected to become a promising tool for identifying enhancers.


Subjects
Deep Learning , Neural Networks, Computer
8.
Pharmaceuticals (Basel) ; 15(6)2022 Jun 03.
Article in English | MEDLINE | ID: mdl-35745625

ABSTRACT

Bioactive peptides are typically small functional peptides with 2-20 amino acid residues and play versatile roles in metabolic and biological processes. Bioactive peptides are multi-functional, so it is very challenging to detect all their functions accurately and simultaneously. We proposed a convolutional neural network (CNN) and bi-directional long short-term memory (Bi-LSTM)-based deep learning method (called MPMABP) for recognizing multiple activities of bioactive peptides. MPMABP stacked five CNNs at different scales and used a residual network to prevent information loss. The empirical results showed that MPMABP is superior to the state-of-the-art methods. Analysis of the distribution of amino acids indicated that lysine tends to appear in anti-cancer peptides, leucine in anti-diabetic peptides, and proline in anti-hypertensive peptides. The method and analysis are useful for recognizing multiple activities of bioactive peptides.

9.
Comput Biol Chem ; 98: 107689, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35537363

ABSTRACT

Embryonic stem cells (ESCs) can self-renew and maintain pluripotency while continuously providing a source of various differentiated cell types. The fate decision between remaining in the ground state and transitioning to a differentiated state can be read out from the regulatory network of key transcription factors (TFs). However, the underlying mechanism remains to be fully elucidated. In this paper, we tackle this problem by proposing a novel cellular differentiation model for mouse embryonic stem cell (MESC) dynamics regulation: MESC-DRM. We employ a nonlinear least-squares algorithm to infer model parameters from benchmark datasets, construct a potential function using multivariate Gaussian distributions, and project the potential landscape into 3D space to validate and replicate the stable cell states observed in experiments. Traditional cell landscape modeling techniques rely on visualizing the potential function to determine the stable states of cells, but visualization becomes almost impossible when the dimensionality of the potential function is greater than 3. We address this challenge by employing a Lyapunov method, which resolves it through a more straightforward analytical approach and provides a more rigorous and robust basis for accurate cell fate decisions. The study not only validates previous experimental results but also provides an insightful guide for cell fate decisions and may inspire future work on this topic.


Subjects
Algorithms , Embryonic Stem Cells , Animals , Cell Differentiation , Mice
10.
Comb Chem High Throughput Screen ; 25(3): 381-391, 2022.
Article in English | MEDLINE | ID: mdl-33045963

ABSTRACT

AIM AND OBJECTIVE: The similarities comparison of biological sequences is an important task in bioinformatics. The methods of the similarities comparison for biological sequences are divided into two classes: sequence alignment method and alignment-free method. The graphical representation of biological sequences is a kind of alignment-free method, which constitutes a tool for analyzing and visualizing the biological sequences. In this article, a generalized iterative map of protein sequences was suggested to analyze the similarities of biological sequences. MATERIALS AND METHODS: Based on the normalized physicochemical indexes of 20 amino acids, each amino acid can be mapped into a point in 5D space. A generalized iterative function system was introduced to outline a generalized iterative map of protein sequences, which can not only reflect various physicochemical properties of amino acids but also incorporate with different compression ratios of the component of a generalized iterative map. Several properties were proved to illustrate the advantage of the generalized iterative map. The mathematical description of the generalized iterative map was suggested to compare the similarities and dissimilarities of protein sequences. Based on this method, similarities/dissimilarities were compared among ND5 protein sequences, as well as ND6 protein sequences of ten different species. RESULTS: By correlation analysis, the ClustalW results were compared with our similarity/dissimilarity results and other graphical representation results to show the utility of our approach. The comparison results show that our approach has better correlations with ClustalW for all species than other approaches and illustrate the effectiveness of our approach. CONCLUSION: Two examples show that our method not only has good performances and effects in the similarity/dissimilarity analysis of protein sequences but also does not require complex computation.


Subjects
Proteins , Sequence Analysis, Protein , Algorithms , Amino Acid Sequence , Computational Biology/methods , Proteins/chemistry , Sequence Alignment , Sequence Analysis, Protein/methods
11.
Front Genet ; 12: 712170, 2021.
Article in English | MEDLINE | ID: mdl-34490041

ABSTRACT

Studies have found that long non-coding RNAs (lncRNAs) play important roles in many human biological processes, and it is critical to explore potential lncRNA-disease associations, especially cancer-associated lncRNAs. However, traditional biological experiments are costly and time-consuming, so it is of great significance to develop effective computational models. We developed a random walk algorithm with restart on multiplex and heterogeneous networks of lncRNAs and diseases to predict lncRNA-disease associations (MHRWRLDA). First, multiple disease similarity networks are constructed by using different approaches to calculate similarity scores between diseases, and multiple lncRNA similarity networks are also constructed by using different approaches to calculate similarity scores between lncRNAs. Then, a multiplex and heterogeneous network was constructed by integrating multiple disease similarity networks and multiple lncRNA similarity networks with the lncRNA-disease associations, and a random walk with restart on the multiplex and heterogeneous network was performed to predict lncRNA-disease associations. The results of Leave-One-Out cross-validation (LOOCV) showed that the value of Area under the curve (AUC) was 0.68736, which was improved compared with the classical algorithm in recent years. Finally, we confirmed a few novel predicted lncRNAs associated with specific diseases like colon cancer by literature mining. In summary, MHRWRLDA contributes to predict lncRNA-disease associations.
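
Editorial sketch for entry 11: the abstract relies on a random walk with restart (RWR). The Python below shows the generic RWR iteration p(t+1) = (1 - r) W p(t) + r p(0) on a single small graph. It does not reproduce the multiplex and heterogeneous network construction of MHRWRLDA. The function name, the restart probability of 0.7, and the toy graph are assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk that restarts at the seeds.

    adjacency: square list-of-lists of non-negative edge weights.
    seeds: indices of the nodes the walker restarts from.
    """
    n = len(adjacency)
    # Column-normalise so that W[i][j] is the probability of stepping from j to i.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    W = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)

    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2, restarting at node 0.
graph = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
scores = random_walk_with_restart(graph, seeds=[0])
```

Nodes closer to the seed receive higher scores. In an association-prediction setting, candidate nodes would be ranked by these scores.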

12.
Spectrochim Acta A Mol Biomol Spectrosc ; 241: 118685, 2020 Nov 05.
Article in English | MEDLINE | ID: mdl-32653821

ABSTRACT

Two fluorescent probes were designed by connecting indomethacin to coumarin through different linkers. The introduction of indomethacin quenched the fluorescence of coumarin-based probes with apparent red-shifts in the absorption and emission maxima, probably due to the photoinduced electron transfer (PET) from the indomethacin to the fluorophore and the formation of folding conformation. The addition of human serum albumin (HSA) triggered about 40-fold fluorescence enhancements of ADC-IMC-2 and ADC-IMC-6 with 85 nm blue-shifts. The probe with longer spacer ADC-IMC-6 exhibited ratiometric fluorescent response toward HSA, and that with shorter linker showed "off-on" fluorescence response to HSA. However, insignificant spectral changes of the reference compounds (ADC-6 and ADC-2) initiated by HSA implied that indomethacin played critical role in the identification of HSA. The competitive assays and molecular docking results reveal that the indomethacin in ADC-IMC-6 could tightly combine at drug site I of HSA. Fluorescence bio-imaging experiments show that both probes could distinguish cancer cells from normal cells.


Subjects
Fluorescent Dyes , Neoplasms , Humans , Indomethacin , Molecular Docking Simulation , Neoplasms/diagnostic imaging , Neoplasms/drug therapy , Protein Binding , Serum Albumin, Human , Spectrometry, Fluorescence
13.
Comput Math Methods Med ; 2020: 3974598, 2020.
Article in English | MEDLINE | ID: mdl-32328150

ABSTRACT

The type III secretion system (T3SS) is a special protein delivery system in Gram-negative bacteria which delivers T3SS-secreted effectors (T3SEs) to host cells causing pathological changes. Numerous experiments have verified that T3SEs play important roles in many biological activities and in host-pathogen interactions. Accurate identification of T3SEs is therefore essential to help understand the pathogenic mechanism of bacteria; however, many existing biological experimental methods are time-consuming and expensive. New deep-learning methods have recently been successfully applied to T3SE recognition, but improving the recognition accuracy of T3SEs is still a challenge. In this study, we developed a new deep-learning framework, ACNNT3, based on the attention mechanism. We converted 100 residues of the N-terminal of the protein sequence into a fusion feature vector of protein primary structure information (one-hot encoding) and position-specific scoring matrix (PSSM) which are used as the feature input of the network model. We then embedded the attention layer into CNN to learn the characteristic preferences of type III effector proteins, which can accurately classify any protein directly as either T3SEs or non-T3SEs. We found that the introduction of new protein features can improve the recognition accuracy of the model. Our method combines the advantages of CNN and the attention mechanism and is superior in many indicators when compared to other popular methods. Using the common independent dataset, our method is more accurate than the previous method, showing an improvement of 4.1-20.0%.


Subjects
Deep Learning , Neural Networks, Computer , Type III Secretion Systems/genetics , Amino Acid Sequence , Bacterial Proteins/genetics , Bacterial Proteins/physiology , Computational Biology , Databases, Protein/statistics & numerical data , Gram-Negative Bacteria/genetics , Gram-Negative Bacteria/pathogenicity , Gram-Negative Bacteria/physiology , Host Microbial Interactions/genetics , Host Microbial Interactions/physiology , Type III Secretion Systems/physiology
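
Editorial sketch for entry 13: the abstract describes converting the first 100 N-terminal residues into a one-hot encoding that is fused with a PSSM. The Python below illustrates only the one-hot part. The zero-padding of short sequences and the all-zero row for non-standard residues are assumptions, and the names are hypothetical. The PSSM would come from an external profile search and is not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_n_terminal(sequence, length=100):
    """Encode the first `length` residues as a length x 20 matrix of 0/1 values."""
    rows = []
    for residue in sequence[:length].upper():
        row = [0] * len(AMINO_ACIDS)
        index = AA_INDEX.get(residue)
        if index is not None:  # unknown residues (e.g. X) stay all-zero
            row[index] = 1
        rows.append(row)
    # Pad sequences shorter than `length` with all-zero rows.
    while len(rows) < length:
        rows.append([0] * len(AMINO_ACIDS))
    return rows


matrix = one_hot_n_terminal("MKX")
```

A fused input as described in the abstract would concatenate each 20-value one-hot row with the corresponding 20-value PSSM row for the same position.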
14.
Biomed Res Int ; 2020: 7584968, 2020.
Article in English | MEDLINE | ID: mdl-32337273

ABSTRACT

Residue-residue contact prediction has become an increasingly important tool for modeling the three-dimensional structure of a protein when no homologous structure is available. Ultradeep residual neural network (ResNet) has become the most popular method for making contact predictions because it captures the contextual information between residues. In this paper, we propose a novel deep neural network framework for contact prediction which combines ResNet and DenseNet. This framework uses 1D ResNet to process sequential features, and besides PSSM, SS3, and solvent accessibility, we have introduced a new feature, position-specific frequency matrix (PSFM), as an input. Using ResNet's residual module and identity mapping, it can effectively process sequential features after which the outer concatenation function is used for sequential and pairwise features. Prediction accuracy is improved following a final processing step using the dense connection of DenseNet. The prediction accuracy of the protein contact map shows that our method is more effective than other popular methods due to the new network architecture and the added feature input.


Subjects
Proteins/chemistry , Algorithms , Computational Biology/methods , Neural Networks, Computer , Protein Conformation
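
Editorial sketch for entry 14: the abstract mentions an "outer concatenation function" that turns sequential (per-residue) features into pairwise features. The exact form used in the paper is not given. The Python below shows the simplest variant, assumed here, in which the pairwise feature for residues i and j is the feature vector of i followed by that of j. The function name is hypothetical.

```python
def outer_concatenation(features):
    """Turn an L x d list of per-residue features into an L x L x 2d pairwise tensor."""
    length = len(features)
    return [
        [features[i] + features[j] for j in range(length)]  # list concatenation
        for i in range(length)
    ]


# Three residues with two features each.
per_residue = [[1, 2], [3, 4], [5, 6]]
pairwise = outer_concatenation(per_residue)
```

The resulting L x L grid has the same shape as a contact map, so it can be stacked with other pairwise inputs before a 2D network predicts contacts.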
15.
Curr Drug Metab ; 20(3): 236-243, 2019.
Article in English | MEDLINE | ID: mdl-30657038

ABSTRACT

BACKGROUND: Some studies have shown that Human Papillomavirus (HPV) is strongly associated with cervical cancer. Cervical cancer remains the fourth most common cancer affecting women worldwide. Thus, it is both challenging and essential to detect the risk types of human papillomaviruses. METHODS: To discriminate whether an HPV type is high-risk or not, many epidemiological and experimental methods have been proposed recently. For HPV risk type prediction, there have also been a few computational studies, all based on Machine Learning (ML) techniques but adopting different feature extraction methods. Therefore, we summarize and discuss several classical approaches that have achieved good results for the risk type prediction of HPV. RESULTS: This review summarizes the common methods to detect human papillomavirus. The main methods are sequence-derived features, text-based classification, the gap-kernel method, ensemble SVM, the word statistical model, the position-specific statistical model and the mismatch kernel method (SVM). Among these methods, the position-specific statistical model achieves a relatively high accuracy (accuracy=97.18%). The word statistical model is also a novel approach, which extracts information about HPV from the protein "sequence space" to predict high-risk types of HPVs (accuracy=95.59%). These methods could potentially be used to improve prediction of high-risk types of HPVs. CONCLUSION: From the prediction accuracies, we conclude that classification results are more accurate when mathematical models are established. Thus, adopting mathematical methods to predict the risk type of HPV will be the main goal of future research.


Subjects
Papillomaviridae/classification , Algorithms , Databases, Factual , Female , Humans , Models, Statistical , Papillomavirus Infections , Risk , Uterine Cervical Neoplasms
16.
Int J Mol Sci ; 20(2)2019 Jan 11.
Article in English | MEDLINE | ID: mdl-30641858

ABSTRACT

As a common malignant tumor disease, thyroid cancer lacks effective preventive and therapeutic drugs. Thus, it is crucial to provide an effective drug selection method for thyroid cancer patients. The connectivity map (CMAP) project provides an experimentally validated strategy to repurpose and optimize cancer drugs, the rationale behind which is to select drugs that reverse the gene expression variations induced by cancer. However, it has a few limitations. Firstly, CMAP was performed on cell lines, which usually differ from human tissues. Secondly, only gene expression information was considered, while information about gene regulation and modules/pathways was more or less ignored. In this study, we first comprehensively measured the perturbations of thyroid cancer on patients, including variations at the gene expression level, gene co-expression level and gene module level. After that, we provided a drug selection pipeline to reverse the perturbations based on drug signatures derived from tissue studies. We applied the analysis pipeline to The Cancer Genome Atlas (TCGA) thyroid cancer data consisting of 56 normal and 500 cancer samples. As a result, we obtained 812 up-regulated and 213 down-regulated genes, whose functions are significantly enriched in extracellular matrix and receptor localization to synapses. In addition, a total of 33,778 significantly differentially co-expressed gene pairs were found, which form a larger module associated with impaired immune function and low immunity. Finally, we predicted drugs and gene perturbations that could reverse the gene expression and co-expression changes incurred by the development of thyroid cancer using Fisher's exact test. Top predicted drugs included validated drugs such as baclofen, nevirapine, glucocorticoid and formaldehyde.
Combining our analyses with literature mining, we inferred that the regulation of thyroid hormone secretion might be closely related to the inhibition of the proliferation of thyroid cancer cells.


Subjects
Antineoplastic Agents/pharmacology , Gene Expression Profiling/methods , Gene Regulatory Networks/drug effects , Thyroid Neoplasms/drug therapy , Antineoplastic Agents/therapeutic use , Computational Biology , Data Mining , Drug Repositioning , Extracellular Matrix/genetics , Gene Expression Regulation, Neoplastic/drug effects , Humans , Models, Theoretical , Synapses/genetics , Thyroid Neoplasms/genetics
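
Editorial sketch for entry 16: the abstract names Fisher's exact test as the statistic used to match drugs to expression changes but does not say how the contingency tables were built. As a rough illustration only, the Python below computes a one-sided Fisher p-value for a 2 x 2 table from the hypergeometric distribution. Reading the table as "genes up-regulated in cancer" versus "genes down-regulated by a drug" is an assumption, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided p-value P(X >= a) for the 2 x 2 table [[a, b], [c, d]].

    Hypothetical reading of the cells:
      a: up in cancer and down under the drug   b: up in cancer only
      c: down under the drug only               d: neither
    """
    row1 = a + b          # genes up in cancer
    col1 = a + c          # genes down under the drug
    total = a + b + c + d
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(total - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(total, col1)


p_value = fisher_exact_greater(3, 1, 1, 3)
```

A small p-value would suggest that the drug reverses more cancer-induced changes than expected by chance. Any ranking of drugs would then sort by this value, typically after multiple-testing correction.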
17.
BMC Bioinformatics ; 20(Suppl 22): 719, 2019 Dec 30.
Article in English | MEDLINE | ID: mdl-31888447

ABSTRACT

BACKGROUND: Prediction of protein subcellular localization is an important component of bioinformatics, with great importance for drug design and other applications. A multitude of computational tools for protein subcellular localization have been developed in recent decades; however, existing methods differ in the protein sequence representation techniques and classification algorithms they adopt. RESULTS: In this paper, we first introduce two protein sequence encoding schemes: dipeptide information with space and Gapped k-mer information. Then, a Gapped k-mer calculation method based on a quad-tree is introduced. CONCLUSIONS: The prediction results show that this method not only reduces the dimension but also improves the prediction precision of protein subcellular localization.


Subjects
Algorithms , Computational Biology/methods , Information Storage and Retrieval/methods , Proteins/chemistry , Subcellular Fractions/metabolism , Amino Acid Sequence , Databases, Protein , Dipeptides/chemistry , Support Vector Machine
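
Editorial sketch for entry 17: the abstract refers to "dipeptide information with space" as a sequence encoding. One common reading, assumed here, is a count of residue pairs separated by a fixed number of intervening positions. The Python below implements that reading with a plain counter. It does not reproduce the quad-tree calculation mentioned in the abstract, and the function name is hypothetical.

```python
from collections import Counter


def gapped_dipeptide_counts(sequence, gap):
    """Count residue pairs (x, y) where exactly `gap` residues lie between x and y."""
    step = gap + 1
    return Counter(
        sequence[i] + sequence[i + step] for i in range(len(sequence) - step)
    )


adjacent = gapped_dipeptide_counts("ACACA", gap=0)   # ordinary dipeptides
one_apart = gapped_dipeptide_counts("ACACA", gap=1)  # pairs with one residue between
```

With 20 standard amino acids each gap value yields at most 400 pair types, so a fixed-length feature vector can be built by reading the counts in a fixed pair order.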
18.
Front Genet ; 9: 411, 2018.
Article in English | MEDLINE | ID: mdl-30459803

ABSTRACT

In recent years, it has been increasingly clear that long noncoding RNAs (lncRNAs) play critical roles in many biological processes associated with human diseases. Inferring potential lncRNA-disease associations is essential to reveal the secrets behind diseases, develop novel drugs, and optimize personalized treatments. However, biological experiments to validate lncRNA-disease associations are very time-consuming and costly. Thus, it is critical to develop effective computational models. In this study, we have proposed a method called BPLLDA to predict lncRNA-disease associations based on paths of fixed lengths in a heterogeneous lncRNA-disease association network. Specifically, BPLLDA first constructs a heterogeneous lncRNA-disease network by integrating the lncRNA-disease association network, the lncRNA functional similarity network, and the disease semantic similarity network. It then infers the probability of an lncRNA-disease association based on paths connecting them and their lengths in the network. Compared to existing methods, BPLLDA has a few advantages, including not demanding negative samples and the ability to predict associations related to novel lncRNAs or novel diseases. BPLLDA was applied to a canonical lncRNA-disease association database called LncRNADisease, together with two popular methods LRLSLDA and GrwLDA. The leave-one-out cross-validation areas under the receiver operating characteristic curve of BPLLDA are 0.87117, 0.82403, and 0.78528, respectively, for predicting overall associations, associations related to novel lncRNAs, and associations related to novel diseases, higher than those of the two compared methods. In addition, cervical cancer, glioma, and non-small-cell lung cancer were selected as case studies, for which the predicted top five lncRNA-disease associations were verified by recently published literature. 
In summary, BPLLDA exhibits good performances in predicting novel lncRNA-disease associations and associations related to novel lncRNAs and diseases. It may contribute to the understanding of lncRNA-associated diseases like certain cancers.

19.
Talanta ; 189: 429-436, 2018 Nov 01.
Article in English | MEDLINE | ID: mdl-30086942

ABSTRACT

A near-infrared fluorescent probe for sulfite, YSP, was synthesized, in which a julolidine fused with a pyran-2-one was employed as the fluorophore and a vinyl group activated by an indole salt as the receptor. The introduction of julolidine and the indole salt strengthens the electron push-pull effect of the probe and allows it to absorb (597 nm) and emit (681 nm) in the red wavelength region. The addition of sulfite to the C=C bond led to prominent blue-shifts in both the absorption (171 nm) and emission (165 nm) spectra, which made colorimetric and ratiometric fluorescent detection of sulfite possible. NMR titration results illustrated that the determination of sulfite is a two-step process: nucleophilic addition of sulfite to the unsaturated carbon of C=N in the indole ring, followed by intramolecular rearrangement through a four-membered ring to form adduct-B with a shorter absorption wavelength. In addition, the cationic feature of YSP enables the probe to localize specifically in mitochondria, and it allows ratiometric bioimaging of sulfite in living HepG-2 and L929 cells.


Subjects
Fluorescent Dyes/metabolism , Infrared Rays , Mitochondria/metabolism , Sulfites/analysis , Sulfites/metabolism , Water/chemistry , Cell Survival , Fluorescent Dyes/chemistry , Hep G2 Cells , Humans , Hydrogen-Ion Concentration , Models, Molecular , Molecular Conformation , Spectrometry, Fluorescence
20.
Evol Bioinform Online ; 14: 1176934318777755, 2018.
Article in English | MEDLINE | ID: mdl-29977111

ABSTRACT

In this article, we propose a 3-dimensional graphical representation of protein sequences based on 10 physicochemical properties of 20 amino acids and the BLOSUM62 matrix. It contains evolutionary information and provides intuitive visualization. To further analyze the similarity of proteins, we extract a specific vector from the graphical representation curve. The vector is used to calculate the similarity distance between 2 protein sequences. To prove the effectiveness of our approach, we apply it to 3 real data sets. The results are consistent with the known evolution fact and show that our method is effective in phylogenetic analysis.
