Results 1 - 20 of 20
1.
BMC Genomics ; 23(Suppl 1): 301, 2022 Apr 13.
Article in English | MEDLINE | ID: mdl-35418074

ABSTRACT

BACKGROUND: Nucleosome positioning is the precise determination of the location of nucleosomes on a DNA sequence. With the continuous advancement of biotechnology and computer technology, biological data are growing explosively, so developing an efficient nucleosome positioning algorithm is of practical significance. Convolutional neural networks (CNNs) can capture local features in DNA sequences but ignore the order of bases, whereas a bidirectional recurrent neural network can make up for this shortcoming and extract the long-term dependency features of a DNA sequence. RESULTS: In this work, we use word vectors to represent DNA sequences and propose three new deep learning models for nucleosome positioning, and the integrative model NP_CBiR achieves better prediction performance. The overall accuracies of NP_CBiR on the H. sapiens, C. elegans, and D. melanogaster datasets are 86.18%, 89.39%, and 85.55%, respectively. CONCLUSIONS: Benefiting from its different network structures, NP_CBiR can effectively extract local features and base-order features of DNA sequences and can thus be considered a complementary tool for nucleosome positioning.


Subjects
Deep Learning , Nucleosomes , Animals , Base Sequence , Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Nucleosomes/genetics , Plant Extracts
2.
BMC Bioinformatics ; 22(Suppl 6): 129, 2021 Jun 02.
Article in English | MEDLINE | ID: mdl-34078256

ABSTRACT

BACKGROUND: Nucleosomes play an important role in genome expression, DNA replication, DNA repair and transcription, so research on nucleosome positioning has long received extensive attention. Considering the diversity of DNA sequence representation methods, we integrated multiple features and analyzed their effect on nucleosome positioning analysis. This process can also deepen our theoretical understanding of nucleosome positioning. RESULTS: Here, we not only used frequency chaos game representation (FCGR) to construct DNA sequence features, but also integrated it with other features and applied the principal component analysis (PCA) algorithm. Support vector machine (SVM), extreme learning machine (ELM), extreme gradient boosting (XGBoost), multilayer perceptron (MLP) and convolutional neural network (CNN) models were each used as predictors for nucleosome positioning. The prediction quality of the integrated feature vector is significantly superior to that of a single feature. After PCA was used to reduce the feature dimension, the prediction quality on the H. sapiens dataset improved significantly. CONCLUSIONS: Comparative analysis and prediction on the H. sapiens, C. elegans, D. melanogaster and S. cerevisiae datasets demonstrate that applying FCGR to nucleosome positioning is feasible, and we also found that integrative feature representation performs better.


Subjects
Caenorhabditis elegans , Nucleosomes , Algorithms , Animals , Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Machine Learning , Nucleosomes/genetics , Saccharomyces cerevisiae/genetics , Support Vector Machine
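
The frequency chaos game representation (FCGR) named in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the corner assignment in `CORNERS`, the bit ordering, and the function name `fcgr` are assumptions made for illustration, and published CGR variants differ on these choices. The sketch counts every k-mer in a DNA string and places the count in a 2^k x 2^k grid.

```python
# Assumed corner layout (hypothetical choice; CGR conventions vary).
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(seq, k):
    """Return a 2**k x 2**k grid of k-mer counts arranged by CGR cell."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNERS for base in kmer):
            continue  # skip windows with ambiguous bases such as N
        x = y = 0
        for base in kmer:
            bx, by = CORNERS[base]
            # each base contributes one bit per axis; the first base
            # of the k-mer is the most significant bit here (assumption)
            x = 2 * x + bx
            y = 2 * y + by
        grid[y][x] += 1
    return grid
```

Flattening the grid row by row would give a feature vector of length 4^k that could be passed to a classifier or to PCA, which is the general kind of pipeline the abstract describes.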
3.
Chaos ; 31(2): 023115, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33653076

ABSTRACT

To precisely analyze the fractal nature of a short-term time series under the multiscale framework, this study introduces multiscale adaptive multifractal analysis (MAMFA) combining the adaptive fractal analysis method with the multiscale multifractal analysis (MMA). MAMFA and MMA are both applied to the two kinds of simulation sequences, and the results show that the MAMFA method achieves better performances than MMA. MAMFA is also applied to the Chinese and American stock indexes and the R-R interval of heart rate data. It is found that the multifractal characteristics of stock sequences are related to the selection of the scale range s. There is a big difference in the Hurst surface's shape of Chinese and American stock indexes and Chinese stock indexes have more obvious multifractal characteristics. For the R-R interval sequence, we find that the subjects with abnormal heart rate have significant shape changes in three areas of Hurst surface compared with healthy subjects, thereby patients can be effectively distinguished from healthy subjects.


Subjects
Fractals , Computer Simulation , Heart Rate , Humans
4.
Chaos ; 30(5): 053113, 2020 May.
Article in English | MEDLINE | ID: mdl-32491907

ABSTRACT

A novel general randomized method is proposed to investigate multifractal properties of long time series. Based on multifractal temporally weighted detrended fluctuation analysis (MFTWDFA), we obtain randomized multifractal temporally weighted detrended fluctuation analysis (RMFTWDFA). The innovation of this algorithm is to apply randomization when dividing the series into multiple intervals to find the local trend. To test the performance of the RMFTWDFA algorithm, we apply it, together with MFTWDFA, to artificially generated time series and real genomic sequences. For three types of artificially generated time series, consistency tests are performed on the estimated h(q), and all results indicate that there is no significant difference in the estimated h(q) of the two methods. Meanwhile, for different sequence lengths, the running time of RMFTWDFA is reduced by more than a factor of ten. We use large-scale prokaryote genomic sequences as real examples; the results obtained by RMFTWDFA demonstrate that these genomic sequences show fractal characteristics, and we use the estimated exponents to study phylogenetic relationships between species. The final clustering results are consistent with the real relationships. All the results show that RMFTWDFA is effective and time-saving for long time series, while achieving an accuracy statistically comparable to that of other methods.


Subjects
Fractals , Phylogeny , Algorithms , Bacteria/genetics , Databases, Genetic
5.
Mol Phylogenet Evol ; 89: 37-45, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25882834

ABSTRACT

There has been a growing interest in alignment-free methods for whole genome comparison and phylogenomic studies. In this study, we propose an alignment-free method for phylogenetic tree construction using whole-proteome sequences. Based on the inter-amino-acid distances, we first convert the whole-proteome sequences into inter-amino-acid distance vectors, which are called observed inter-amino-acid distance profiles. Then, we propose to use conditional geometric distribution profiles (the distributions of sequences where the amino acids are placed randomly and independently) as the reference distribution profiles. Finally, the relative deviation between the observed and reference distribution profiles is used to define a simple metric that reflects the phylogenetic relationships between whole-proteome sequences of different organisms. We name our method inter-amino-acid distances and conditional geometric distribution profiles (IAGDP). We evaluate our method on two data sets: a benchmark dataset of 29 genomes used in previously published papers, and another dataset of 67 mammal genomes. Our results demonstrate that the new method is useful and efficient.


Subjects
Amino Acids/analysis , Phylogeny , Proteome/analysis , Proteome/chemistry , Amino Acids/chemistry , Animals , Base Sequence , Databases, Genetic , Genome/genetics , Mammals/genetics , Proteome/genetics
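
The idea of comparing observed inter-amino-acid distances with a geometric reference, as described in the abstract above, can be sketched roughly as follows. This is an illustrative simplification, not the IAGDP method itself: it uses a plain (unconditioned) geometric distribution, and the function names and the exact form of the relative deviation are assumptions.

```python
from collections import Counter


def inter_residue_distances(seq, residue):
    """Count gaps between consecutive occurrences of one residue."""
    positions = [i for i, aa in enumerate(seq) if aa == residue]
    return Counter(b - a for a, b in zip(positions, positions[1:]))


def geometric_pmf(p, d):
    """P(distance = d) if the residue occurs independently with probability p."""
    return p * (1 - p) ** (d - 1)


def relative_deviation(seq, residue, max_d):
    """Assumed form: (observed - reference) / reference for d = 1..max_d."""
    if not seq:
        return [0.0] * max_d
    counts = inter_residue_distances(seq, residue)
    total = sum(counts.values())
    p = seq.count(residue) / len(seq)  # residue frequency in the sequence
    deviations = []
    for d in range(1, max_d + 1):
        observed = counts[d] / total if total else 0.0
        reference = geometric_pmf(p, d)
        deviations.append((observed - reference) / reference if reference else 0.0)
    return deviations
```

Concatenating such deviation vectors over all 20 amino acids would give one profile per proteome; how the published method conditions the geometric distribution and turns profiles into a distance is not stated in the abstract.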
6.
Cancer Cell Int ; 15: 21, 2015.
Article in English | MEDLINE | ID: mdl-25792973

ABSTRACT

BACKGROUND: PPP2R2C encodes the gamma isoform of the B55 subfamily of regulatory subunits, which forms the PP2A heterotrimer together with the A and C subunits. Currently, the precise functions of B55gamma in cancer are still under investigation. In this project, we report a novel function of B55gamma in the regulation of glucose metabolism in glioma cells. METHODS: Western blot and immunoprecipitation were performed to determine protein expression and interaction. Cell viability was measured by Trypan Blue staining and direct cell counting using a hemocytometer. siRNA technology was used to down-regulate protein expression. RESULTS: Glucose uptake and lactate production were suppressed by overexpression of B55gamma in glioma cells. In addition, cancer cells with larger amounts of B55gamma showed a greater survival advantage in response to glucose starvation through the dephosphorylation of S6K. From proteomic analysis, we found that B55gamma binds to and up-regulates SIK2 through stabilization of the SIK2 protein, which is required for the B55gamma-mediated suppression of the S6K pathway. Knockdown of SIK2 in B55gamma-overexpressing cells restored the phosphorylation of S6K. CONCLUSION: In summary, our project provides novel insight into the design and development of therapeutic strategies that target B55gamma-mediated glucose metabolism for the treatment of human brain tumor patients.

7.
J Theor Biol ; 344: 31-9, 2014 Mar 07.
Article in English | MEDLINE | ID: mdl-24316387

ABSTRACT

Membrane proteins play important roles in many biochemical processes and are also attractive targets of drug discovery for various diseases. The elucidation of membrane protein types provides clues for understanding the structure and function of proteins. Recently we developed a novel system for predicting protein subnuclear localizations. In this paper, we propose a simplified version of our system for predicting membrane protein types directly from primary protein structures, which incorporates amino acid classifications and physicochemical properties into a general form of pseudo-amino acid composition. In this simplified system, we design a two-stage multi-class support vector machine combined with a two-step optimal feature selection process, which proves very effective in our experiments. The performance of the present method is evaluated on two benchmark datasets consisting of five types of membrane proteins. The overall accuracies of prediction for the five types are 93.25% and 96.61% via the jackknife test and the independent dataset test, respectively. These results indicate that our method is effective and valuable for predicting membrane protein types. A web server for the proposed method is available at http://www.juemengt.com/jcc/memty_page.php.


Subjects
Amino Acids/classification , Membrane Proteins/chemistry , Support Vector Machine , Algorithms , Amino Acids/chemistry , Animals , Chemistry, Physical , Computational Biology/methods , Databases, Protein , Membrane Proteins/analysis
8.
Interdiscip Sci ; 16(1): 176-191, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38099958

ABSTRACT

Since the identification of microRNAs (miRNAs), empirical research has demonstrated their crucial involvement in the functioning of organisms. Investigating miRNAs significantly supports efforts to prevent, diagnose, and treat complex human diseases. Yet exploring every conceivable miRNA-disease association with conventional wet-lab experiments consumes significant resources and time. On the computational side, predicting potential miRNA-disease associations gives medical investigators a valuable source of preliminary insights. We have therefore developed a novel matrix factorization model, Hessian-regularized [Formula: see text] nonnegative matrix factorization in combination with deep learning, for predicting associations between miRNAs and diseases, denoted as [Formula: see text]-NMF-DF. In particular, we introduce a novel iterative fusion approach to integrate all similarities. This method effectively reduces the sparsity of the initial miRNA-disease association matrix. Additionally, we devise a mixed model framework that uses deep learning, matrix decomposition, and singular value decomposition to capture the intricate nonlinear features of miRNAs and diseases. The prediction performance of the six matrix factorization methods is improved by comparison and analysis, similarity matrix fusion, data preprocessing, and parameter adjustment. The AUC and AUPR obtained by the new matrix factorization model under fivefold cross validation are comparable to or better than those of other matrix factorization models. Finally, we select three diseases (lung tumor, bladder tumor and breast tumor) for case analysis, and further extend the matrix factorization model based on deep learning. The results show that the hybrid algorithm combining matrix factorization with deep learning proposed in this paper can predict miRNAs related to different diseases with high accuracy.


Subjects
Deep Learning , Lung Neoplasms , MicroRNAs , Humans , MicroRNAs/genetics , Algorithms , ROC Curve , Computational Biology/methods , Genetic Predisposition to Disease
9.
Neuroscience ; 538: 46-58, 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38110170

ABSTRACT

Ischemia-reperfusion (IR) induces a wide range of irreversible injuries. Cerebral IR injury (IRI) refers to additional brain tissue damage that occurs after blood flow is restored following cerebral ischemia. Currently, no established methods exist for treating IRI. Oxidative stress is recognized as a primary mechanism initiating IRI and a crucial focal target for its treatment. Urolithin B, a metabolite derived from ellagitannins, antioxidant polyphenols, has demonstrated protective effects against oxidative stress in various disease conditions. However, the precise mechanism underlying UB's effect on IRI remains unclear. In our current investigation, we assessed UB's ability to mitigate neurological functional impairment induced by IR using a neurological deficit score. Additionally, we examined cerebral infarction following UB administration through TTC staining and neuron Nissl staining. UB's inhibition of neuronal apoptosis was demonstrated through the TUNEL assay and Caspase-3 measurement. Additionally, we examined UB's effect on oxidative stress levels by analyzing malondialdehyde (MDA) concentration, superoxide dismutase (SOD) activity, and immunohistochemistry analysis of inducible nitric oxide synthase (iNOS) and 8-hydroxyl-2'-deoxyguanosine (8-OHdG). Notably, UB demonstrated a reduction in oxidative stress levels. Mechanistically, UB was found to stimulate the Nrf2/HO-1 signaling pathway, as evidenced by the significant reduction in UB's neuroprotective effects upon administration of ATRA, an Nrf2 inhibitor. In summary, UB effectively inhibits oxidative stress induced by IR through the activation of the Nrf2/HO-1 signaling pathway. These findings suggest that UB holds promise as a therapeutic agent for the treatment of IRI.


Subjects
Brain Ischemia , Coumarins , Neuroprotective Agents , Reperfusion Injury , Rats , Animals , Rats, Sprague-Dawley , NF-E2-Related Factor 2/metabolism , Brain Ischemia/drug therapy , Brain Ischemia/metabolism , Oxidative Stress , Cerebral Infarction , Reperfusion Injury/drug therapy , Reperfusion Injury/metabolism , Neuroprotective Agents/pharmacology , Neuroprotective Agents/therapeutic use
10.
BMC Cancer ; 13: 478, 2013 Oct 14.
Article in English | MEDLINE | ID: mdl-24124917

ABSTRACT

BACKGROUND: MiR-106a is frequently down-regulated in various types of human cancer. However, the underlying mechanism by which miR-106a is involved in glioma remains elusive. METHODS: The association of miR-106a with glioma grade and patient survival was analyzed. The biological function and target of miR-106a were determined by bioinformatic analysis and cell experiments (Western blot, luciferase reporter, cell cycle, intracellular ATP production and glucose uptake assays). Finally, rescue expression of its target SLC2A3 was used to test the role of SLC2A3 in miR-106a-mediated cell glycolysis and proliferation. RESULTS: Here we showed that miR-106a is a tumor suppressor miRNA involved in GBM cell glucose uptake and proliferation. MiR-106a was decreased in GBM tissues, and this decrease conferred poor survival on GBM patients. SLC2A3 was identified as a core target of miR-106a in GBM cells. Inhibition of SLC2A3 by miR-106a attenuated cell proliferation and inhibited glucose uptake. In addition, for each biological process we identified ontology-associated transcripts that significantly correlated with SLC2A3 expression. Finally, the expression of SLC2A3 largely abrogated the effects of miR-106a on cell proliferation and glucose uptake in GBM cells. CONCLUSIONS: Taken together, miR-106a and SLC2A3 could be potential therapeutic targets for GBM.


Subjects
Glioblastoma/genetics , Glioblastoma/metabolism , Glucose Transporter Type 3/genetics , Glucose/metabolism , MicroRNAs/genetics , Base Pairing , Base Sequence , Cell Line, Tumor , Cell Proliferation , Cluster Analysis , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Glioblastoma/mortality , Glucose Transporter Type 3/metabolism , Glycolysis , Humans , MicroRNAs/metabolism , Prognosis , RNA Interference
11.
Front Cell Infect Microbiol ; 13: 1117421, 2023.
Article in English | MEDLINE | ID: mdl-36779183

ABSTRACT

Introduction: The species diversity of microbiomes is a cutting-edge concept in metagenomic research. In this study, we propose a multifractal analysis for metagenomic research. Method and Results: First, we visualized the chaos game representation (CGR) of simulated and real metagenomes and found that the visualized metagenomes exhibit self-similarity. We then defined and calculated the multifractal dimension for the visualized plots of the simulated and real metagenomes. By analyzing the Pearson correlation coefficients between the multifractal dimension and the traditional species diversity indices, we found that the correlation coefficients between the multifractal dimension and the species richness index and Shannon diversity index reached their maximum values when q = 0, 1, and the correlation coefficient between the multifractal dimension and the Simpson diversity index reached its maximum value when q = 5. Finally, we applied our method to real metagenomes of the gut microbiota of 100 infants at birth and at 4 and 12 months of age. The results show that the multifractal dimensions of an infant's gut microbiomes can distinguish age differences. Conclusion and Discussion: There is self-similarity among the CGRs of WGS of metagenomes, and the multifractal spectrum is an important characteristic of metagenomes. The traditional diversity indicators can be unified under the framework of multifractal analysis. These results coincide with similar results in macrobial ecology. The multifractal spectrum of infants' gut microbiomes is related to the development of the infants.


Subjects
Gastrointestinal Microbiome , Microbiota , Humans , Infant , Infant, Newborn , Metagenome , Microbiota/genetics , Gastrointestinal Microbiome/genetics , Metagenomics/methods , Ecology
12.
Front Genet ; 12: 766496, 2021.
Article in English | MEDLINE | ID: mdl-34745231

ABSTRACT

Alignment methods are at a disadvantage in sequence comparison and phylogeny reconstruction because of their high computational costs in time and space. Alignment-free methods, on the other hand, incur low computational costs and have recently gained popularity in the field of bioinformatics. Here we propose a new alignment-free method for phylogenetic tree reconstruction based on whole genome sequences. A key component is a measure called the information-entropy position-weighted k-mer relative measure (IEPWRMkmer), which combines the position-weighted measure of k-mers proposed by our group and the information entropy of the frequency of k-mers. The Manhattan distance is used to calculate the pairwise distance between species. Finally, we use the Neighbor-Joining method to construct the phylogenetic tree. To evaluate the performance of this method, we perform phylogenetic analysis on two datasets used by other researchers. The results demonstrate that the IEPWRMkmer method is efficient and reliable. The source code of our method is provided at https://github.com/wuyaoqun37/IEPWRMkmer.
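
The distance step in the abstract above (Manhattan distance between per-species feature vectors) can be sketched as follows. Note the assumption: plain normalized k-mer frequencies stand in for the IEPWRMkmer measure, whose definition is not given in the abstract, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq, k):
    """Relative frequency of every DNA k-mer, in fixed lexicographic order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)  # ignores k-mers with ambiguous bases
    return [counts[m] / total if total else 0.0 for m in kmers]


def manhattan(u, v):
    """Sum of absolute coordinate differences between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def distance_matrix(seqs, k):
    """Pairwise Manhattan distances between the k-mer profiles of all sequences."""
    profiles = [kmer_profile(s, k) for s in seqs]
    return [[manhattan(p, q) for q in profiles] for p in profiles]
```

A matrix produced this way is the kind of input a Neighbor-Joining implementation expects; the tree-building step itself is left out of this sketch.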

13.
Biomedicines ; 9(9)2021 Sep 03.
Article in English | MEDLINE | ID: mdl-34572337

ABSTRACT

Abnormal miRNA functions are widely involved in many diseases recorded in the database of experimentally supported human miRNA-disease associations (HMDD). Some of the associations are complicated: there can be up to five heterogeneous association types of an miRNA with the same disease, including genetics type, epigenetics type, circulating miRNAs type, miRNA tissue expression type and miRNA-target interaction type. When one type of association is known for an miRNA-disease pair, it is important to predict any other types of association for a better understanding of the disease mechanism. It is even more important to reveal associations for currently unassociated miRNAs and diseases. Methods have recently been proposed to predict the association types of miRNA-disease pairs through restricted Boltzmann machines, label propagation theories and tensor completion algorithms. None of them has exploited the non-linear characteristics of the miRNA-disease association network to improve performance. We propose to use attributed multi-layer heterogeneous network embedding to learn the latent representations of miRNAs and diseases from each association type and then to predict the existence of the association type for all the miRNA-disease pairs. The performance of our method is compared with the two most recent methods via 10-fold cross-validation on the database HMDD v3.2 to demonstrate the superior prediction achieved by our method under different settings. Moreover, our predictions made beyond the HMDD database can all be validated by the NCBI literature, confirming that our method is capable of accurately predicting new associations of miRNAs with diseases as well as their association types.

14.
Int J Mol Sci ; 11(3): 1141-54, 2010 Mar 18.
Article in English | MEDLINE | ID: mdl-20480005

ABSTRACT

A shortcoming of most correlation distance methods based on the composition vectors without alignment developed for phylogenetic analysis using complete genomes is that the "distances" are not proper distance metrics in the strict mathematical sense. In this paper we propose two new correlation-related distance metrics to replace the old one in our dynamical language approach. Four genome datasets are employed to evaluate the effects of this replacement from a biological point of view. We find that the two proper distance metrics yield trees with the same or similar topologies as/to those using the old "distance" and agree with the tree of life based on 16S rRNA in a majority of the basic branches. Hence the two proper correlation-related distance metrics proposed here improve our dynamical language approach for phylogenetic analysis.


Subjects
Algorithms , Genomics/methods , Phylogeny , Sequence Alignment/methods , Animals , Genome, Bacterial , Genome, Plant
15.
J Clin Neurosci ; 16(10): 1291-5, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19577930

ABSTRACT

Degenerative lumbar spinal stenosis (DLSS) can be treated by several surgical procedures. However, the choice of procedure and use of instrumentation remain controversial. In this retrospective study of 81 patients with DLSS, 43 patients received decompression and posterolateral fusion without instrumentation, and the surgery for 38 patients was supplemented with posterior transpedicular screw fixation. Both surgeon-based (Fischgrund criteria) and patient-based (Medical Outcome Trust Short-Form 36 [SF-36] questionnaire) standards were used to assess the clinical outcomes. An excellent to good result was achieved in 71.6% of patients, and there was no significant difference 6.2 years later between groups with or without instrumentation (Z = 0.0358, p > 0.05). SF-36 data revealed significant postoperative improvement (p < 0.01), and there was no significant difference between the two groups (t = 1.67, p > 0.05). Successful fusion occurred in 87% of patients with instrumentation versus 67% of the patients without instrumentation (χ² = 4.23, p < 0.05). Thus, surgical treatment of DLSS generally results in satisfactory outcomes. Transpedicular screw fixation may not improve clinical outcomes, and the use of posterior instrumentation should be adopted cautiously.


Subjects
Lumbar Vertebrae/surgery , Neurodegenerative Diseases/surgery , Spinal Fusion/methods , Spinal Stenosis/surgery , Adult , Aged , Bone Screws , Female , Humans , Internal Fixators , Lumbar Vertebrae/diagnostic imaging , Male , Middle Aged , Neurodegenerative Diseases/complications , Neurodegenerative Diseases/diagnostic imaging , Spinal Fusion/instrumentation , Spinal Stenosis/complications , Spinal Stenosis/diagnostic imaging , Statistics, Nonparametric , Tomography, X-Ray Computed/methods , Treatment Outcome
16.
Front Genet ; 10: 1325, 2019.
Article in English | MEDLINE | ID: mdl-32117407

ABSTRACT

Butyrylation plays a crucial role in cellular processes. Owing to technical limitations, identifying histone butyrylation sites on a large scale is challenging. To fill this gap, we propose an approach based on information entropy and machine learning for computationally identifying histone butyrylation sites. The proposed method achieves an area under the receiver operating characteristic (ROC) curve of 0.92 on the training set by 3-fold cross validation and 0.80 on the testing set by independent test. Feature analysis implies that amino acid residues downstream and upstream of butyrylation sites exhibit a specific sequence motif to a certain extent. Functional analysis suggests that histone butyrylation is most likely associated with four pathways (systemic lupus erythematosus, alcoholism, viral carcinogenesis and transcriptional misregulation in cancer), is involved in binding with other molecules and in processes of biosynthesis, assembly, arrangement or disassembly, and is located in complexes consisting of DNA, RNA, protein, etc. The proposed method is useful for predicting histone butyrylation sites. The feature and functional analyses improve understanding of histone butyrylation and increase knowledge of the functions of butyrylated histones.

17.
Comput Biol Chem ; 57: 21-8, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25736609

ABSTRACT

Random walk on heterogeneous networks is a recently emerging approach to effective disease gene prioritization. Laplacian normalization is a technique capable of normalizing the weight of edges in a network. We use this technique to normalize the gene matrix and the phenotype matrix before the construction of the heterogeneous network, and also use this idea to define the transition matrices of the heterogeneous network. Our method has remarkably better performance than the existing methods for recovering known gene-phenotype relationships. The Shannon information entropy of the distribution of the transition probabilities in our networks is found to be smaller than that of the networks constructed by the existing methods, implying that a higher number of top-ranked genes can be verified as disease genes. In fact, the most probable gene-phenotype relationships ranked within the top 3 or top 5 in our gene lists can be confirmed by the OMIM database in many cases. Our algorithms have shown remarkably superior performance over the state-of-the-art algorithms for recovering gene-phenotype relationships. All Matlab code is available upon email request.


Subjects
Computational Biology , Disease/genetics , Gene Regulatory Networks , Algorithms , Entropy , Humans , Phenotype
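
The Laplacian normalization mentioned in the abstract above is commonly written as dividing each edge weight W[i][j] by sqrt(d_i * d_j), where d_i is the weighted degree of node i. The sketch below assumes that symmetric form and a symmetric input matrix; the abstract does not spell out the exact formula the authors used, and the function name is hypothetical.

```python
import math


def laplacian_normalize(w):
    """Scale each weight w[i][j] by 1 / sqrt(deg_i * deg_j).

    Assumes a square, symmetric matrix of non-negative weights.
    Nodes with zero degree keep all-zero rows and columns.
    """
    n = len(w)
    degrees = [sum(row) for row in w]  # weighted degree of each node
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if degrees[i] > 0 and degrees[j] > 0:
                out[i][j] = w[i][j] / math.sqrt(degrees[i] * degrees[j])
    return out
```

For a three-node path graph with all edge weights equal to 2, the degrees are 2, 4 and 2, so each edge weight becomes 2 / sqrt(8); the normalization reduces the influence of edges attached to high-degree nodes.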
18.
PLoS One ; 8(2): e57225, 2013.
Article in English | MEDLINE | ID: mdl-23460833

ABSTRACT

BACKGROUND: Predicting protein subnuclear localization is a challenging problem. Some previous works based on non-sequence information, including Gene Ontology annotations and kernel fusion, have their respective limitations. The aim of this work is twofold: first, to propose a novel individual feature extraction method; second, to develop an ensemble method that improves prediction performance using comprehensive information represented as a high-dimensional feature vector obtained by 11 feature extraction methods. METHODOLOGY/PRINCIPAL FINDINGS: A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It considers only those feature extraction methods based on amino acid classifications and physicochemical properties. To speed up our system, an automatic search method for the kernel parameter is used. The prediction performance of our method is evaluated on four datasets: the Lei dataset, a multi-localization dataset, the SNL9 dataset and a new independent dataset. In leave-one-out cross validation, the overall accuracy of prediction is 75.2% for 6 localizations on the Lei dataset and 72.1% for 9 localizations on the SNL9 dataset; it is 71.7% for the multi-localization dataset and 69.8% for the new independent dataset. Comparisons with existing methods show that our method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of our classification model are further confirmed by permutation analysis. CONCLUSIONS: It can be concluded that our method is effective and valuable for predicting protein subnuclear localizations. A web server has been designed to implement the proposed method. It is freely available at http://bioinformatics.awowshop.com/snlpred_page.php.


Subjects
Cell Nucleus/metabolism , Proteins/chemistry , Proteins/metabolism , Sequence Analysis, Protein/methods , Amino Acid Sequence , Databases, Protein , Models, Molecular , Protein Transport , ROC Curve , Reproducibility of Results , Subcellular Fractions/metabolism , Support Vector Machine
19.
PLoS One ; 7(7): e42154, 2012.
Article in English | MEDLINE | ID: mdl-22848736

ABSTRACT

BACKGROUND: The composition vector (CV) method has been proved to be a reliable and fast alignment-free method to analyze large COI barcoding data. In this study, we modify this method for analyzing multi-gene datasets for plant DNA barcoding. The modified method includes an adjustable-weighted algorithm for the vector distance according to the ratio in sequence length of the candidate genes for each pair of taxa. METHODOLOGY/PRINCIPAL FINDINGS: Three datasets, matK+rbcL dataset with 2,083 sequences, matK+rbcL dataset with 397 sequences and matK+rbcL+trnH-psbA dataset with 397 sequences, were tested. We showed that the success rates of grouping sequences at the genus/species level based on this modified CV approach are always higher than those based on the traditional K2P/NJ method. For the matK+rbcL datasets, the modified CV approach outperformed the K2P-NJ approach by 7.9% in both the 2,083-sequence and 397-sequence datasets, and for the matK+rbcL+trnH-psbA dataset, the CV approach outperformed the traditional approach by 16.7%. CONCLUSIONS: We conclude that the modified CV approach is an efficient method for analyzing large multi-gene datasets for plant DNA barcoding. Source code, implemented in C++ and supported on MS Windows, is freely available for download at http://math.xtu.edu.cn/myphp/math/research/source/Barcode_source_codes.zip.


Subjects
DNA Barcoding, Taxonomic/methods , Data Interpretation, Statistical , Genetic Loci/genetics , Plants/classification , Plants/genetics
20.
J Clin Neurosci ; 16(11): 1443-8, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19683929

ABSTRACT

The aim of this study is to evaluate an integrated cage and plate device (the plate cage Benezech, PCB) filled with autogenous bone in anterior cervical discectomy and fusion. The fused segment height, lordosis, and fusion were assessed by postoperative radiographic examination at different intervals. Patients were evaluated using Odom's criteria and the Short Form (SF)-36 Health Survey questionnaire. The mean follow-up duration was 4.1 years. Fusion was achieved in 90.0%, 96.0% and 100% of patients at 3 months, 6 months and at final visit, respectively. The fused segment height and lordosis were restored and maintained. Cage subsidence (3mm) occurred at one level and settling was observed at three levels. An excellent-to-good result was achieved in 81.8% of patients. The data from the SF-36 questionnaire revealed significant postoperative improvement (p<0.01) except for social function and mental health. This study suggests that patients instrumented with PCB can obtain good radiographic and clinical results and that PCB is a safe and effective device in cervical anterior fusion.


Subjects
Bone Plates , Diskectomy , Spinal Cord Injuries/surgery , Spinal Fusion/instrumentation , Spinal Fusion/methods , Adult , Aged , Cervical Vertebrae/diagnostic imaging , Cervical Vertebrae/pathology , Diskectomy/methods , Female , Health Surveys , Humans , Male , Middle Aged , Retrospective Studies , Spinal Cord Injuries/diagnostic imaging , Spinal Cord Injuries/pathology , Surveys and Questionnaires , Tomography, X-Ray Computed/methods , Treatment Outcome