Results 1 - 20 of 58
1.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37615358

ABSTRACT

Non-coding RNAs (ncRNAs) play a critical role in biology. ncRNAs from the same family usually have similar functions; as a result, it is essential to predict ncRNA families before identifying their functions. There are two primary methods for predicting ncRNA families, namely, traditional biological methods and computational methods. Traditional biological methods require a lot of manpower and resources to predict ncRNA families. Therefore, this paper proposes a new computational ncRNA family prediction method called MFPred. MFPred identifies ncRNA families by extracting sequence features of ncRNAs, and it has three primary modules: (1) an ncRNA sequence encoding and feature extraction module, which encodes ncRNA sequences and extracts four different features of them; (2) a dynamic Bi_GRU and feature fusion module, which extracts contextual information features of the ncRNA sequence; and (3) a ResNet_SE module, which extracts local information features of the ncRNA sequence. In this study, MFPred was compared with previously proposed ncRNA family prediction methods on two frequently used public ncRNA datasets, NCY and nRC. The results showed that MFPred outperformed the other prediction methods on both datasets.


Subject(s)
Computational Biology , RNA, Untranslated , Humans , Computational Biology/methods , RNA, Untranslated/genetics
2.
BMC Bioinformatics ; 25(1): 133, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38539106

ABSTRACT

Cancer is one of the leading causes of death worldwide. Survival analysis and prediction for cancer patients is of great significance for precision medicine. The robustness and interpretability of survival prediction models are important: robustness indicates whether a model has actually learned the underlying knowledge, and interpretability indicates whether a model can show humans what it has learned. In this paper, we propose a robust and interpretable model, SurvConvMixer, which uses pathway-customized gene expression images and ConvMixer for short-term, mid-term and long-term overall survival prediction in cancer. With ConvMixer, the representation of each pathway can be learned separately. We show the robustness of our model by testing the trained model on external datasets that were not used in training. The interpretability of SurvConvMixer relies on gradient-weighted class activation mapping (Grad-CAM), from which we obtain pathway-level activation heat maps. Wilcoxon rank-sum tests are then conducted to identify the statistically significant pathways, thereby revealing which pathways the model focuses on more. SurvConvMixer achieves remarkable performance on short-term, mid-term and long-term overall survival prediction for lung adenocarcinoma, lung squamous cell carcinoma and skin cutaneous melanoma, and the external validation tests show that SurvConvMixer generalizes to external datasets and is therefore robust. Finally, we investigate the activation maps generated by Grad-CAM; after Wilcoxon rank-sum tests and Kaplan-Meier estimation, we find that some survival-related pathways play an important role in SurvConvMixer.


Subject(s)
Adenocarcinoma of Lung , Lung Neoplasms , Melanoma , Skin Neoplasms , Humans , Gene Expression
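
The abstract above mentions Kaplan-Meier estimation without giving details. As a rough illustration only, the product-limit estimator can be sketched in plain Python as follows. The function name `kaplan_meier` and the input layout (parallel lists of follow-up times and 0/1 event flags) are assumptions made for this sketch, not part of SurvConvMixer.

```python
# Minimal sketch of the Kaplan-Meier product-limit estimator.
# Assumption: `events[i]` is 1 if the event (death) was observed at
# `times[i]` and 0 if the subject was censored at that time.

def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set.
        at_risk -= removed
    return curve
```

Censored subjects never lower the curve directly; they only shrink the risk set used at later event times.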
3.
Beilstein J Org Chem ; 20: 852-858, 2024.
Article in English | MEDLINE | ID: mdl-38655555

ABSTRACT

We confirm the previously revised stereochemistry of spiroviolene by X-ray crystallographic characterization of a hydrazone derivative of 9-oxospiroviolane, which is synthesized by hydroboration/oxidation of spiroviolene followed by oxidation of the resultant hydroxy group. An unexpected thermal boron migration occurred during the hydroboration of spiroviolene, which resulted in a mixture of 1α-hydroxyspiroviolane, 9α-hydroxyspiroviolane and 9β-hydroxyspiroviolane after oxidation. The assertion of the cis-orientation of the 19- and 20-methyl groups provided further support for the revised cyclization mechanism of spiroviolene.

4.
BMC Bioinformatics ; 24(1): 68, 2023 Feb 27.
Article in English | MEDLINE | ID: mdl-36849908

ABSTRACT

BACKGROUND: Although research on non-coding RNAs (ncRNAs) is a hot topic in life sciences, the functions of numerous ncRNAs remain unclear. In recent years, researchers have found that ncRNAs of the same family have similar functions, therefore, it is important to accurately predict ncRNAs families to identify their functions. There are several methods available to solve the prediction problem of ncRNAs family, whose main ideas can be divided into two categories, including prediction based on the secondary structure features of ncRNAs, and prediction according to sequence features of ncRNAs. The first type of prediction method requires a complicated process and has a low accuracy in obtaining the secondary structure of ncRNAs, while the second type of method has a simple prediction process and a high accuracy, but there is still room for improvement. The existing methods for ncRNAs family prediction are associated with problems such as complicated prediction processes and low accuracy, in this regard, it is necessary to propose a new method to predict the ncRNAs family more perfectly. RESULTS: A deep learning model-based method, ncDENSE, was proposed in this study, which predicted ncRNAs families by extracting ncRNAs sequence features. The bases in ncRNAs sequences were encoded by one-hot coding and later fed into an ensemble deep learning model, which contained the dynamic bi-directional gated recurrent unit (Bi-GRU), the dense convolutional network (DenseNet), and the Attention Mechanism (AM). To be specific, dynamic Bi-GRU was used to extract contextual feature information and capture long-term dependencies of ncRNAs sequences. AM was employed to assign different weights to features extracted by Bi-GRU and focused the attention on information with greater weights. Whereas DenseNet was adopted to extract local feature information of ncRNAs sequences and classify them by the full connection layer. 
According to our results, the ncDENSE method improved the Accuracy, Sensitivity, Precision, F-score, and MCC by 2.08[Formula: see text], 2.33[Formula: see text], 2.14[Formula: see text], 2.16[Formula: see text], and 2.39[Formula: see text], respectively, compared with the suboptimal method. CONCLUSIONS: Overall, the ncDENSE method proposed in this paper extracts sequence features of ncRNAs by dynamic Bi-GRU and DenseNet and improves the accuracy in predicting ncRNAs family and other data.


Subject(s)
Biological Science Disciplines , Deep Learning , Humans , RNA, Untranslated/genetics
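
The ncDENSE abstract above states that bases in ncRNA sequences are encoded by one-hot coding before being fed to the model. A minimal sketch of such an encoding is shown below. The base order `ACGU`, the mapping of `T` to `U`, and the all-zero vector for unknown symbols are assumptions for illustration and are not taken from the paper.

```python
# Illustrative one-hot encoding of an RNA sequence.
# Assumptions (not from the abstract): alphabet order A, C, G, U;
# DNA-style T is treated as U; any other symbol becomes all zeros.

BASES = "ACGU"

def one_hot_encode(sequence):
    """Return one 4-element 0/1 vector per base of the sequence."""
    encoded = []
    for base in sequence.upper().replace("T", "U"):
        encoded.append([1 if base == b else 0 for b in BASES])
    return encoded

if __name__ == "__main__":
    for base, vector in zip("GAUN", one_hot_encode("GAUN")):
        print(base, vector)
```

The resulting list of vectors has shape (sequence length, 4), which is the usual input layout for recurrent or convolutional layers.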
5.
BMC Bioinformatics ; 24(1): 353, 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37730567

ABSTRACT

OBJECTIVE: Breast cancer is a significant health issue for women, and human epidermal growth factor receptor-2 (HER2) plays a crucial role as a vital prognostic and predictive factor. The HER2 status is essential for formulating effective treatment plans for breast cancer. However, the assessment of HER2 status using immunohistochemistry (IHC) is time-consuming and costly. Existing computational methods for evaluating HER2 status have limitations and lack sufficient accuracy. Therefore, there is an urgent need for an improved computational method to better assess HER2 status, which holds significant importance in saving lives and alleviating the burden on pathologists. RESULTS: This paper analyzes the characteristics of histological images of breast cancer and proposes a neural network model named HAHNet that combines multi-scale features with attention mechanisms for HER2 status classification. HAHNet directly classifies the HER2 status from hematoxylin and eosin (H&E) stained histological images, reducing additional costs. It achieves superior performance compared to other computational methods. CONCLUSIONS: According to our experimental results, the proposed HAHNet achieved high performance in classifying the HER2 status of breast cancer using only H&E stained samples. It can be applied in case classification, benefiting the work of pathologists and potentially helping more breast cancer patients.


Subject(s)
Breast Neoplasms , Humans , Female , Eosine Yellowish-(YS) , Neural Networks, Computer , Staining and Labeling
6.
Methods ; 204: 368-375, 2022 08.
Article in English | MEDLINE | ID: mdl-35490852

ABSTRACT

Access to RNA secondary structure is a prerequisite for understanding and mastering RNA function. RNA secondary structures play an important role in cells: they can cause or contribute to neurological disorders and can be applied in the medical field. However, the experimental methods for obtaining RNA secondary structure are costly, laborious and not universal. Although computational methods can predict RNA secondary structure more accurately for short-sequence RNAs, they cannot predict long-sequence RNAs and pseudoknots, which is currently the bottleneck of RNA secondary structure prediction. In recent years, researchers have attempted to use deep learning algorithms to predict RNA secondary structure and have achieved results. However, the small amount of data on the secondary structure of long-sequence RNAs leads to low accuracy when deep learning methods predict the secondary structure of RNAs across races. Similarly, RNA structures with pseudoknots are very complex, and insufficient data causes deep learning algorithms to struggle to predict the secondary structure of RNA containing pseudoknots. In this work, the RNA data are encoded into grayscale images by a unique encoding method based on the real RNA secondary structure and sequence information. The paper then expands the image data to increase the amount of RNA data, which addresses the problem of insufficient data for predicting long sequences and RNA secondary structures with pseudoknots in current deep learning methods and provides a good data foundation for deep learning. The article proposes a multi-scale feature fusion Conditional Deep Convolutional Generative Adversarial Network prediction model (MSFF-CDCGAN), based on an improved Conditional Deep Convolutional Generative Adversarial Network (CDCGAN) model, to predict RNA secondary structure. The experimental results showed that the MSFF-CDCGAN model could predict long-sequence RNAs and pseudoknots more accurately than traditional prediction methods.
This paper introduces Generative Adversarial Network (GAN) to RNA secondary structure prediction for the first time. It uses a unique image encoding approach to expand the original RNA data set, thus transforming the structure prediction problem into an image analysis problem and effectively solving the bottleneck in RNA secondary structure prediction.


Subject(s)
Algorithms , RNA , Image Processing, Computer-Assisted , Protein Structure, Secondary , RNA/chemistry , RNA/genetics , Sequence Analysis, RNA/methods
7.
BMC Bioinformatics ; 23(1): 354, 2022 Aug 23.
Article in English | MEDLINE | ID: mdl-35999499

ABSTRACT

BACKGROUND: RNA secondary structure is very important for deciphering cell activity and disease occurrence. The first method used by researchers to predict this structure was biological experiment, but this method is too expensive, which has limited its adoption. Computational methods then emerged, which offer good efficiency and low cost. However, the accuracy of computational methods is not satisfactory. Many machine learning methods have also been applied to this area, but the accuracy has not improved significantly. Deep learning has matured and achieved great success in many areas such as computer vision and natural language processing. It uses neural networks, a kind of structure with good functionality and versatility, but its effectiveness is highly correlated with the quantity and quality of the data. At present, there is no model with high accuracy, low data dependence and high convenience for predicting RNA secondary structure. RESULTS: This paper designs a neural network called LTPConstraint to predict RNA secondary structure. The network is based on several network structures, such as Bidirectional LSTM, Transformer and generator. It also uses transfer learning to train the model so that the data dependence can be reduced. CONCLUSIONS: LTPConstraint has achieved high accuracy in RNA secondary structure prediction. Compared with previous methods, the accuracy improves clearly both in predicting structures with pseudoknots and structures without pseudoknots. At the same time, LTPConstraint is easy to operate and produces results very quickly.


Subject(s)
Neural Networks, Computer , RNA , Machine Learning , Protein Structure, Secondary , RNA/chemistry
8.
BMC Infect Dis ; 22(1): 490, 2022 May 23.
Article in English | MEDLINE | ID: mdl-35606725

ABSTRACT

BACKGROUND: Tuberculosis (TB) is the respiratory infectious disease with the highest incidence in China. We aim to design a series of forecasting models and find the factors that affect the incidence of TB, thereby improving the accuracy of incidence prediction. RESULTS: In this paper, we developed a new interpretable prediction system based on the multivariate multi-step Long Short-Term Memory (LSTM) model and the SHapley Additive exPlanation (SHAP) method. Four accuracy measures are introduced into the system: Root Mean Square Error, Mean Absolute Error, Mean Absolute Percentage Error, and symmetric Mean Absolute Percentage Error. The Autoregressive Integrated Moving Average (ARIMA) model and the seasonal ARIMA model are established. The multi-step ARIMA-LSTM model is proposed for the first time, and the performance of each model is examined in the short, medium, and long term. Compared with the ARIMA model, the four errors of the multivariate 2-step LSTM model are reduced by 12.92%, 15.94%, 15.97%, and 14.81%, respectively, in the short term. The 3-step ARIMA-LSTM model achieved excellent performance, with each error decreased to 15.19%, 33.14%, 36.79%, and 29.76% in the medium and long term. We also provide, for the first time in the field of incidence prediction, local and global explanations of the multivariate single-step LSTM model. CONCLUSIONS: The multivariate 2-step LSTM model is suitable for short-term prediction and achieved performance similar to previous studies. The 3-step ARIMA-LSTM model is appropriate for medium-to-long-term prediction and outperforms these models. The SHAP results indicate that the five most crucial features are maximum temperature, average relative humidity, local financial budget, monthly sunshine percentage, and sunshine hours.


Subject(s)
Tuberculosis , China/epidemiology , Forecasting , Humans , Incidence , Models, Statistical , Temperature , Tuberculosis/epidemiology
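
The tuberculosis forecasting abstract above names four accuracy measures: Root Mean Square Error, Mean Absolute Error, Mean Absolute Percentage Error, and symmetric Mean Absolute Percentage Error. A minimal sketch of their usual textbook definitions follows. The sMAPE variant used here (denominator equal to the mean of the absolute actual and predicted values) is one common convention and is an assumption, since the abstract does not say which variant the authors used.

```python
import math

# Textbook definitions of four forecast error measures.
# `actual` and `predicted` are equal-length sequences of numbers.

def rmse(actual, predicted):
    """Root Mean Square Error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)

def smape(actual, predicted):
    """Symmetric MAPE, in percent (assumed variant: mean of |a| and |p| as denominator)."""
    ratios = [abs(a - p) / ((abs(a) + abs(p)) / 2) for a, p in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)
```

RMSE and MAE are expressed in the units of the incidence series, while MAPE and sMAPE are scale-free percentages, which is why studies often report all four together.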
9.
Bioorg Chem ; 128: 106060, 2022 11.
Article in English | MEDLINE | ID: mdl-35926428

ABSTRACT

Fourteen phenolic constituents, notopheninetols A-E (1-5), notoflavinols A and B (6 and 7), and (2R)-5,4'-dihydroxy-7-O-[(E)-3,7-dimethyl-2,6-octadienyl]flavanone (8a), along with 12 known analogues (8b and 9-19) were isolated from the roots and rhizomes of Notopterygium incisum. Compounds 1-4 and 6-8 were seven pairs of enantiomers, and they were separated by chiral HPLC to obtain the optically pure compounds. The structures of the new compounds were elucidated based on detailed analyses of 1D and 2D NMR and HRESIMS data, and the absolute configurations were determined by quantum chemical calculations of the electronic circular dichroism (ECD) spectra, comparison of the experimental ECD data with those reported, and chemical methods. Compounds 1 and 2 possessed a 1-benzyl-2-methyl-indane skeleton, which was unprecedented in natural source. All of the isolated compounds were evaluated for their nitric oxide (NO) inhibitory effects on RAW264.7 cells induced by LPS, and compounds 6a/6b, 7a, 8a/8b, and the hydrogenated products 6'a and 7'a showed moderate inhibitory activities with IC50 values in the range of 6.2-20.6 µM. Moreover, the interactions of these bioactive compounds with inducible nitric oxide synthase (iNOS) were explored by employing molecular docking simulation.


Subject(s)
Apiaceae , Rhizome , Apiaceae/chemistry , Molecular Docking Simulation , Molecular Structure , Nitric Oxide/analysis , Plant Roots/chemistry , Rhizome/chemistry
10.
Entropy (Basel) ; 24(9)2022 Sep 10.
Article in English | MEDLINE | ID: mdl-36141162

ABSTRACT

Precise iris segmentation is a very important part of accurate iris recognition. Traditional iris segmentation methods require complex prior knowledge and pre- and post-processing and have limited accuracy under non-ideal conditions. Deep learning approaches outperform traditional methods. However, the limitation of a small number of labeled datasets degrades their performance drastically because of the difficulty in collecting and labeling irises. Furthermore, previous approaches ignore the large distribution gap within the non-ideal iris dataset due to illumination, motion blur, squinting eyes, etc. To address these issues, we propose a three-stage training strategy. Firstly, supervised contrastive pretraining is proposed to increase intra-class compactness and inter-class separability to obtain a good pixel classifier under a limited amount of data. Secondly, the entire network is fine-tuned using cross-entropy loss. Thirdly, an intra-dataset adversarial adaptation is proposed, which reduces the intra-dataset gap in the non-ideal situation by aligning the distribution of the hard and easy samples at the pixel class level. Our experiments show that our method improved the segmentation performance and achieved the following encouraging results: 0.44%, 1.03%, 0.66%, 0.41%, and 0.37% in the Nice1 and 96.66%, 98.72%, 93.21%, 94.28%, and 97.41% in the F1 for UBIRIS.V2, IITD, MICHE-I, CASIA-D, and CASIA-T.
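
The iris segmentation abstract above reports results as Nice1 error rates and F1 scores. As a hedged illustration, both can be computed from binary masks as sketched below. The Nice1 definition used here (fraction of pixels on which prediction and ground truth disagree) follows the common description of that measure and is an assumption, as are the function names and the nested-list mask layout.

```python
# Pixel-level segmentation metrics on binary masks.
# Masks are nested lists of 0/1 values with identical shapes.

def nice1_error(pred_mask, true_mask):
    """Fraction of pixels where the two masks disagree (assumed Nice1 definition)."""
    total = 0
    wrong = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            total += 1
            wrong += int(p != t)
    return wrong / total

def f1_score(pred_mask, true_mask):
    """F1 of the positive (iris) class: 2*TP / (2*TP + FP + FN)."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    denom = 2 * tp + fp + fn
    # Both masks empty: treated here as perfect agreement (an assumption).
    return 2 * tp / denom if denom else 1.0
```

A lower Nice1 value and a higher F1 value both indicate better agreement with the ground truth, which matches how the two groups of percentages in the abstract should be read.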

11.
J Comput Sci Technol ; 37(4): 991-1002, 2022.
Article in English | MEDLINE | ID: mdl-35992496

ABSTRACT

First discovered in Wuhan, China, SARS-CoV-2 is a highly pathogenic novel coronavirus, which rapidly spread globally and became a pandemic with no vaccine and limited distinctive clinical drugs available till March 13th, 2020. Ribonucleic Acid interference (RNAi) technology, a gene-silencing technology that targets mRNA, can cause damage to RNA viruses effectively. Here, we report a new efficient small interfering RNA (siRNA) design method named Simple Multiple Rules Intelligent Method (SMRI) to propose a new solution of the treatment of COVID-19. To be specific, this study proposes a new model named Base Preference and Thermodynamic Characteristic model (BPTC model) indicating the siRNA silencing efficiency and a new index named siRNA Extended Rules index (SER index) based on the BPTC model to screen high-efficiency siRNAs and filter out the siRNAs that are difficult to take effect or synthesize as a part of the SMRI method, which is more robust and efficient than the traditional statistical indicators under the same circumstances. Besides, to silence the spike protein of SARS-CoV-2 to invade cells, this study further puts forward the SMRI method to search candidate high-efficiency siRNAs on SARS-CoV-2's S gene. This study is one of the early studies applying RNAi therapy to the COVID-19 treatment. According to the analysis, the average value of predicted interference efficiency of the candidate siRNAs designed by the SMRI method is comparable to that of the mainstream siRNA design algorithms. Moreover, the SMRI method ensures that the designed siRNAs have more than three base mismatches with human genes, thus avoiding silencing normal human genes. This is not considered by other mainstream methods, thereby the five candidate high-efficiency siRNAs which are easy to take effect or synthesize and much safer for human body are obtained by our SMRI method, which provide a new safer, small dosage and long efficacy solution for the treatment of COVID-19. 
Supplementary Information: The online version contains supplementary material available at 10.1007/s11390-021-0826-x.
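
The SMRI abstract above says the designed siRNAs must have more than three base mismatches with human genes. A simplified sketch of such a filter is given below. It slides the candidate along each transcript and counts mismatches at every ungapped position. The function names, the direct strand-to-strand comparison, and the absence of gaps are all simplifying assumptions; the abstract does not describe how the authors performed this check.

```python
# Simplified off-target filter: keep an siRNA only if every ungapped
# alignment against every transcript has more than `threshold` mismatches.
# Assumption: sequences are compared in the same orientation and alphabet.

def min_mismatches(sirna, transcript):
    """Smallest Hamming distance between sirna and any same-length window."""
    n = len(sirna)
    if len(transcript) < n:
        raise ValueError("transcript is shorter than the siRNA")
    return min(
        sum(a != b for a, b in zip(sirna, transcript[i:i + n]))
        for i in range(len(transcript) - n + 1)
    )

def passes_mismatch_filter(sirna, transcripts, threshold=3):
    """True if the best match in every transcript has more than `threshold` mismatches."""
    return all(min_mismatches(sirna, t) > threshold for t in transcripts)
```

For example, `passes_mismatch_filter("ACGU", ["GGACGUGG"])` is `False` because the transcript contains an exact match. Real pipelines typically use an alignment tool for this step rather than an exhaustive scan.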

12.
BMC Bioinformatics ; 22(1): 447, 2021 Sep 20.
Article in English | MEDLINE | ID: mdl-34544356

ABSTRACT

BACKGROUND: Studies have shown that non-coding RNAs (ncRNAs) of the same family have similar functions, so predicting the ncRNA family is helpful for research on ncRNA functions. The existing computational methods mainly fall into two categories: the first type predicts the ncRNA family by learning features of the sequence or secondary structure, and the second type predicts the ncRNA family by alignment among homologous sequences. In the first type, some methods predict the ncRNA family by learning predicted secondary structure features. The inaccuracy of the predicted secondary structure may cause the low accuracy of those methods. In contrast, ncRFP directly learns the features of ncRNA sequences to predict the ncRNA family. Although ncRFP simplifies the prediction process and improves the performance, there is room for improvement in ncRFP performance due to the incomplete features of its input data. In the second type, the homologous sequence alignment method currently achieves the highest performance. However, because it needs consensus secondary structure annotation of ncRNA sequences and cannot model pseudoknots, the use of the method is limited. RESULTS: In this paper, a novel method, "ncDLRES", which learns sequence features, is proposed to predict the family of ncRNAs based on Dynamic LSTM (Long Short-Term Memory) and ResNet (Residual Neural Network). CONCLUSIONS: ncDLRES extracts the features of ncRNA sequences based on Dynamic LSTM and then classifies them by ResNet. Compared with the homologous sequence alignment method, ncDLRES reduces the data requirement and expands the application scope. Compared with the first type of methods, the performance of ncDLRES is greatly improved.


Subject(s)
Computational Biology , RNA, Untranslated , Neural Networks, Computer , Nucleic Acid Conformation , RNA, Untranslated/genetics , Sequence Alignment
13.
BMC Bioinformatics ; 22(1): 169, 2021 Mar 31.
Article in English | MEDLINE | ID: mdl-33789581

ABSTRACT

BACKGROUND: Studies have shown that RNA secondary structure, a planar structure formed by paired bases, plays diverse vital roles in fundamental life activities and complex diseases. The RNA secondary structure profile records whether each base is paired with another. Hence, accurate prediction of the secondary structure profile can help to deduce the secondary structure and binding sites of RNA. The RNA secondary structure profile can be obtained through biological experiment and calculation methods. The biological experiment method involves two approaches: chemical reagents and biological crystallization. The chemical reagent method can obtain a large amount of prediction data, but its cost is high and it is always associated with high noise, making it difficult to get results for all bases of an RNA because of limited sequencing coverage. By contrast, the biological crystallization method yields accurate results, yet requires heavy experimental work and high costs. On the other hand, the calculation method is CROSS, which comprises a three-layer fully connected neural network. However, CROSS cannot completely learn the features of the RNA secondary structure profile because of its poor network structure, leading to low performance. RESULTS: In this paper, a novel end-to-end method named "RPRes" was proposed to predict the RNA secondary structure profile based on Bidirectional LSTM and Residual Neural Network. CONCLUSIONS: RPRes uses data sets generated by multiple biological experiment methods as the training, validation, and test sets to predict the profile, which makes it compatible with numerous prediction requirements. Compared with the biological experiment method, RPRes reduces costs and improves prediction efficiency. Compared with the state-of-the-art calculation method CROSS, RPRes significantly improves performance.


Subject(s)
Neural Networks, Computer , RNA , Binding Sites , Protein Structure, Secondary , RNA/genetics , Research Design
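
The RPRes abstract above describes the secondary structure profile as a record of whether each base is paired. One way to picture this is to derive such a profile from a known structure written in dot-bracket notation, as sketched below. The use of dot-bracket input and the function name `pairing_profile` are assumptions for illustration. RPRes itself predicts the profile from sequence with a neural network, which this sketch does not attempt.

```python
# Derive a per-base paired/unpaired profile from dot-bracket notation.
# Assumption: "()" plus "[]", "{}", "<>" (often used for pseudoknots) mark
# paired bases; every other symbol, such as ".", marks an unpaired base.

def pairing_profile(dot_bracket):
    """Return a list with 1 for paired positions and 0 for unpaired ones."""
    openers = "([{<"
    closers = ")]}>"
    stacks = {opener: [] for opener in openers}
    profile = [0] * len(dot_bracket)
    for i, symbol in enumerate(dot_bracket):
        if symbol in openers:
            stacks[symbol].append(i)
        elif symbol in closers:
            opener = openers[closers.index(symbol)]
            if not stacks[opener]:
                raise ValueError(f"unmatched '{symbol}' at position {i}")
            j = stacks[opener].pop()
            profile[i] = profile[j] = 1
    if any(stacks.values()):
        raise ValueError("structure contains an unclosed bracket")
    return profile
```

Keeping a separate stack per bracket type lets crossing pairs, as in a pseudoknot, be matched correctly.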
14.
Sensors (Basel) ; 20(13)2020 Jul 05.
Article in English | MEDLINE | ID: mdl-32635617

ABSTRACT

Indoor positioning technologies are of great use in GPS-denied areas. They can be partitioned into two types of systems: infrastructure-free and infrastructure-dependent. A WiFi-based indoor positioning system sits somewhere between the infrastructure-free and infrastructure-dependent systems, because Access Points (APs) are necessary as pre-installed infrastructure. However, the APs do not need to be specially installed, because WiFi APs are already widely deployed in many indoor areas, for example, offices, malls and airports. This feature makes WiFi-based indoor positioning suitable for many practical applications. In this paper, a deep learning method based on a seq2seq model is proposed for WiFi-based fingerprinting. The model can learn from training sequences of different lengths and can thus exploit context information for positioning. The context information denotes the information contained in the sequence, which can help find the correspondences between RSS fingerprints and coordinate positions. A simple example of context information is the human walking routine (such as no sharp turns). The proposed method shows an improvement on an open source dataset when compared against deep learning based counterpart methods.

15.
Sensors (Basel) ; 20(10)2020 May 24.
Article in English | MEDLINE | ID: mdl-32456362

ABSTRACT

Radio frequency communication technology has not only greatly improved public network service, but also developed a new technological route for indoor navigation service. However, there is a gap between the precision and accuracy of indoor navigation services provided by indoor navigation service and the expectation of the public. This study proposed a method for constructing a hybrid dual frequency received signal strength indicator (HDRF-RSSI) fingerprint library, which is different from the traditional RSSI fingerprint library constructing method in indoor space using 2.4G radio frequency (RF) under the same Wi-Fi infrastructure condition. The proposed method combined 2.4G RF and 5G RF on the same access point (AP) device to construct a HDRF-RSSI fingerprint library, thereby doubling the fingerprint dimension of each reference point (RP). Experimental results show that the feature discriminability of HDRF-RSSI fingerprinting is 18.1% higher than 2.4G RF RSSI fingerprinting. Moreover, the hybrid radio frequency fingerprinting model, training loss function, and location evaluation algorithm based on the machine learning method were designed, so as to avoid limitation that transmission point (TP) and AP must be visible in the positioning method. In order to verify the effect of the proposed HDRF-RSSI fingerprint library construction method and the location evaluation algorithm, dual RF RSSI fingerprint data was collected to construct a fingerprint library in the experimental scene, which was trained using the proposed method. Several comparative experiments were designed to compare the positioning performance indicators such as precision and accuracy. 
Experimental results demonstrate that compared with the existing machine learning method based on Wi-Fi 2.4G RF RSSI fingerprint, the machine learning method combining Wi-Fi 5G RF RSSI vector and the original 2.4G RF RSSI vector can effectively improve the precision and accuracy of indoor positioning of the smart phone.
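
The HDRF-RSSI abstract above describes combining 2.4G and 5G RSSI readings from the same access points so that each reference point's fingerprint dimension doubles. A minimal sketch of that concatenation is given below. The nearest-neighbour lookup that follows it is a deliberately simple stand-in for the machine learning model in the paper, and all names and values are hypothetical.

```python
import math

# Build a hybrid dual-frequency fingerprint by concatenating per-AP RSSI
# vectors from the two bands, then match a query by Euclidean distance.
# The nearest-neighbour matcher is a stand-in, not the authors' model.

def hybrid_fingerprint(rssi_24g, rssi_5g):
    """Concatenate 2.4G and 5G RSSI vectors measured on the same APs."""
    if len(rssi_24g) != len(rssi_5g):
        raise ValueError("both bands must cover the same access points")
    return list(rssi_24g) + list(rssi_5g)

def nearest_reference_point(library, query):
    """Return the label of the reference point closest to the query."""
    return min(library, key=lambda rp: math.dist(library[rp], query))

if __name__ == "__main__":
    # Hypothetical RSSI values in dBm for two APs at two reference points.
    library = {
        "RP1": hybrid_fingerprint([-40, -70], [-45, -80]),
        "RP2": hybrid_fingerprint([-70, -40], [-80, -45]),
    }
    query = hybrid_fingerprint([-42, -68], [-47, -79])
    print(nearest_reference_point(library, query))
```

With two access points, each single-band fingerprint has two values and the hybrid fingerprint has four, which is the dimension doubling the abstract refers to.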

16.
BMC Genomics ; 19(Suppl 7): 669, 2018 Sep 24.
Article in English | MEDLINE | ID: mdl-30255786

ABSTRACT

BACKGROUND: Small interfering RNA (siRNA) can be used for post-transcriptional gene regulation by knocking down targeted genes. In functional genomics, biomedical research and cancer therapeutics, siRNA design is a critical research topic. Various computational algorithms have been developed to select the most effective siRNA, but their efficacy prediction accuracy is not satisfactory. Many existing computational methods are based on feature engineering, which may lead to biased and incomplete features. Deep learning uses non-linear mapping operations to detect potential feature patterns and has been considered to perform better than existing machine learning methods. RESULTS: In this paper, to further improve the prediction accuracy and facilitate gene functional studies, we developed a new siRNA efficacy predictor based on a deep architecture. First, we extracted hidden feature patterns from two modalities: sequence context features and thermodynamic properties. Then, we constructed a deep architecture to implement the prediction. On the largest available siRNA database, our proposed method achieved a PCC of 0.725 and an AUC of 0.903. The comparative experiment showed that our proposed architecture outperformed several siRNA prediction methods. CONCLUSIONS: The results demonstrate that our deep architecture is stable and efficient in predicting siRNA silencing efficacy. The method could help select candidate siRNAs for a targeted mRNA and further promote the development of RNA interference.


Subject(s)
Algorithms , Computational Biology/methods , Gene Silencing , Gene Targeting/methods , Neural Networks, Computer , RNA, Small Interfering/genetics , Humans , Machine Learning , RNA, Messenger/genetics , RNA, Small Interfering/chemistry
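
The siRNA efficacy abstract above reports performance as a Pearson correlation coefficient (PCC) and an area under the ROC curve (AUC). A minimal sketch of both measures is shown below. How the authors turned continuous efficacy values into the binary labels that AUC requires is not stated in the abstract, so the 0/1 `labels` input is an assumption.

```python
import math

# Pearson correlation between measured and predicted efficacy, and AUC
# computed as the probability that a positive outranks a negative.

def pearson_correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def auc(labels, scores):
    """Area under the ROC curve; ties between scores count as half a win."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

The pairwise AUC loop is quadratic in the number of samples, which is fine for a small check but would normally be replaced by a rank-based computation on large datasets.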
17.
BMC Genomics ; 19(1): 839, 2018 Nov 26.
Article in English | MEDLINE | ID: mdl-30477446

ABSTRACT

BACKGROUND: An increasing number of studies reported that exogenous miRNAs (xenomiRs) can be detected in animal bodies, however, some others reported negative results. Some attributed this divergence to the selective absorption of plant-derived xenomiRs by animals. RESULTS: Here, we analyzed 166 plant-derived xenomiRs reported in our previous study and 942 non-xenomiRs extracted from miRNA expression profiles of four species of commonly consumed plants. Employing statistics analysis and cluster analysis, our study revealed the potential sequence specificity of plant-derived xenomiRs. Furthermore, a random forest model and a one-dimensional convolutional neural network model were trained using miRNA sequence features and raw miRNA sequences respectively and then employed to predict unlabeled plant miRNAs in miRBase. A total of 241 possible plant-derived xenomiRs were predicted by both models. Finally, the potential functions of these possible plant-derived xenomiRs along with our previously reported ones in human body were analyzed. CONCLUSIONS: Our study, for the first time, presents the systematic plant-derived xenomiR sequences analysis and provides evidence for selective absorption of plant miRNA by human body, which could facilitate the future investigation about the mechanisms underlying the transference of plant-derived xenomiR.


Subject(s)
MicroRNAs/genetics , Neural Networks, Computer , Plants/genetics , RNA, Plant/genetics , Biomarkers/analysis , Databases, Nucleic Acid , High-Throughput Nucleotide Sequencing , Humans
18.
Biochem Biophys Res Commun ; 503(3): 1491-1497, 2018 09 10.
Article in English | MEDLINE | ID: mdl-30029874

ABSTRACT

Recent evidence suggests that microRNAs play important roles as negative post-transcriptional regulators, with altered expression levels found in gastric cancer (GC). Therefore, we explored anti-cancer miRNAs and the potential mechanisms by which miRNAs modulate GC progression. We predicted GC miRNA expression data sets in TargetScan. miR-5590-3p expression was higher in adjacent nonmalignant tissue than in cancer tissue in 42 pairs of GC tissues. Functional assays, namely CCK-8 and colony formation assays, were used to determine the anti-cancer role of miR-5590-3p in human GC progression. In addition, Ago2-based RIP and dual-luciferase reporter assays were conducted to study DDX5 as a direct target of miR-5590-3p. Next, xenograft nude mouse models were used to determine the role of miR-5590-3p in GC tumorigenicity in vivo. Upregulation of miR-5590-3p suppressed GC cell proliferation, whereas downregulation of miR-5590-3p promoted GC proliferation in vitro. Furthermore, we identified DDX5 as a direct target of miR-5590-3p and found that the biological function of miR-5590-3p during GC progression in vitro and in vivo is mediated through the DDX5/AKT/m-TOR pathway and downstream cyclinD1 and CDK2 expression. Finally, we confirmed the effect of miR-5590-3p directly targeting DDX5 on the development of gastric cancer through rescue experiments in vivo and in vitro.


Subject(s)
DEAD-box RNA Helicases/antagonists & inhibitors , MicroRNAs/pharmacology , Proto-Oncogene Proteins c-akt/antagonists & inhibitors , Stomach Neoplasms/drug therapy , TOR Serine-Threonine Kinases/antagonists & inhibitors , Animals , Cell Line, Tumor , Cell Proliferation/drug effects , DEAD-box RNA Helicases/metabolism , Humans , Mice , Mice, Inbred BALB C , Mice, Nude , MicroRNAs/genetics , MicroRNAs/metabolism , Neoplasms, Experimental/drug therapy , Neoplasms, Experimental/metabolism , Neoplasms, Experimental/pathology , Proto-Oncogene Proteins c-akt/metabolism , Stomach Neoplasms/metabolism , Stomach Neoplasms/pathology , TOR Serine-Threonine Kinases/metabolism
20.
Bioinformatics ; 30(18): 2576-83, 2014 Sep 15.
Article in English | MEDLINE | ID: mdl-24845652

ABSTRACT

MOTIVATION: Whole-genome sequencing of tumor samples has been demonstrated as an efficient approach for comprehensive analysis of genomic aberrations in cancer genome. Critical issues such as tumor impurity and aneuploidy, GC-content and mappability bias have been reported to complicate identification of copy number alteration and loss of heterozygosity in complex tumor samples. Therefore, efficient computational methods are required to address these issues. RESULTS: We introduce CLImAT (CNA and LOH Assessment in Impure and Aneuploid Tumors), a bioinformatics tool for identification of genomic aberrations from tumor samples using whole-genome sequencing data. Without requiring a matched normal sample, CLImAT takes integrated analysis of read depth and allelic frequency and provides extensive data processing procedures including GC-content and mappability correction of read depth and quantile normalization of B-allele frequency. CLImAT accurately identifies copy number alteration and loss of heterozygosity even for highly impure tumor samples with aneuploidy. We evaluate CLImAT on both simulated and real DNA sequencing data to demonstrate its ability to infer tumor impurity and ploidy and identify genomic aberrations in complex tumor samples. AVAILABILITY AND IMPLEMENTATION: The CLImAT software package can be freely downloaded at http://bioinformatics.ustc.edu.cn/CLImAT/.


Subject(s)
Aneuploidy , DNA Copy Number Variations/genetics , Genomics/methods , Loss of Heterozygosity , Sequence Analysis, DNA , Software , Triple Negative Breast Neoplasms/genetics , Gene Frequency , Humans
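
The CLImAT abstract above refers to GC-content correction of read depth and to B-allele frequency. The two underlying quantities can be sketched as below. This shows only how the raw values are usually computed; the correction and normalization procedures in CLImAT are not described in the abstract and are not reproduced here. Function names and the fixed-window layout are assumptions.

```python
# Raw quantities behind the corrections mentioned in the abstract:
# per-window GC fraction and per-site B-allele frequency.

def gc_content(sequence):
    """Fraction of G/C among A, C, G, T bases; None if no such base is present."""
    counted = [b for b in sequence.upper() if b in "ACGT"]
    if not counted:
        return None
    return sum(b in "GC" for b in counted) / len(counted)

def windowed_gc(sequence, window):
    """GC fraction of consecutive non-overlapping windows of the given size."""
    return [
        gc_content(sequence[i:i + window])
        for i in range(0, len(sequence), window)
    ]

def b_allele_frequency(a_reads, b_reads):
    """Share of reads supporting the B allele at one site; None if no reads."""
    total = a_reads + b_reads
    return b_reads / total if total else None
```

Ambiguous bases such as `N` are skipped rather than counted as non-GC, so windows in unsequenced regions yield `None` instead of a misleading zero.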