1.
Int J Mol Sci ; 23(22), 2022 Nov 18.
Article in English | MEDLINE | ID: mdl-36430822

ABSTRACT

Chronic myeloid leukemia (CML) is a myeloproliferative disease characterized by a unique BCR-ABL fusion gene. Tyrosine kinase inhibitors (TKIs) were developed to target the BCR-ABL oncoprotein, inhibiting its abnormal kinase activity. TKI treatments have significantly improved CML patient outcomes. However, patients can develop drug resistance and relapse after therapy is discontinued, largely due to intratumor heterogeneity. It is critical to understand the differences in therapeutic responses among subpopulations of cells. Single-cell RNA sequencing measures the transcriptome of individual cells, allowing us to differentiate and analyze individual cell populations. Here, we integrated a single-cell RNA sequencing profile of CML stem cells and network analysis to decipher the mechanisms of distinct TKI responses. Compared to normal hematopoietic stem cells, a set of genes that were concordantly differentially expressed in various types of stem cells of CML patients was revealed. Further transcription regulatory network analysis found that most of these genes were directly controlled by one or more transcription factors and that the genes had more regulators in the cells of the patients who responded to the treatment. Molecular markers, including a known drug-resistance gene and novel gene signatures for treatment response, were also identified. Moreover, we combined protein-protein interaction network construction with a cancer drug database and uncovered the drugs that target the marker genes directly or indirectly via the protein interactions. The gene signatures and their interacting proteins identified by this work can be used for treatment response prediction and lead to new strategies for drug resistance monitoring and prevention. Our single-cell-based findings offer novel insights into the mechanisms underlying the therapeutic response of CML.


Subjects
Leukemia, Myelogenous, Chronic, BCR-ABL Positive , Transcriptome , Humans , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/therapeutic use , Drug Resistance, Neoplasm/genetics , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/drug therapy , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/pathology , Fusion Proteins, bcr-abl
2.
BMC Bioinformatics ; 19(1): 181, 2018 05 24.
Article in English | MEDLINE | ID: mdl-29793423

ABSTRACT

After publication of the original article [1], it was noticed that the Acknowledgement statement was incorrect. The original statement reads.

3.
BMC Bioinformatics ; 18(Suppl 14): 489, 2017 12 28.
Article in English | MEDLINE | ID: mdl-29297275

ABSTRACT

BACKGROUND: Long noncoding RNAs (lncRNAs) are involved in diverse biological processes and play an essential role in various human diseases. The number of lncRNAs identified has increased rapidly in recent years owing to RNA sequencing (RNA-Seq) technology. However, presently, most lncRNAs are not well characterized, and their regulatory mechanisms remain elusive. Many lncRNAs show poor evolutionary conservation. Thus, the lncRNAs that are conserved across species can provide insight into their critical functional roles. RESULTS: Here, we performed an orthologous analysis of lncRNAs in human and rat brain tissues. Over two billion RNA-Seq reads generated from 80 human and 66 rat brain tissue samples were analyzed. Our analysis revealed a total of 351 conserved human lncRNAs corresponding to 646 rat lncRNAs. Among these human lncRNAs, 140 were newly identified by our study, and 246 were present in known lncRNA databases; however, the majority of the lncRNAs that have been identified are not yet functionally annotated. We constructed co-expression networks based on the expression profiles of conserved human lncRNAs and protein-coding genes, and produced 79 co-expression modules. Gene ontology (GO) analysis of the co-expression modules suggested that the conserved lncRNAs were involved in various functions such as brain development (P-value = 1.12E-2), nervous system development (P-value = 1.26E-3), and cerebral cortex development (P-value = 1.31E-2). We further predicted the interactions between lncRNAs and protein-coding genes to better understand the regulatory mechanisms of lncRNAs. Moreover, we investigated the expression patterns of the conserved lncRNAs at different time points during rat brain growth. We found that the expression levels of three out of four such lncRNA genes continuously increased from week 2 to week 104, which is consistent with our functional annotation. 
CONCLUSION: Our orthologous analysis of lncRNAs in human and rat brain tissues revealed a set of conserved lncRNAs. Further expression analysis provided the functional annotation of these lncRNAs in humans and rats. Our results offer new targets for developing better experimental designs to investigate regulatory molecular mechanisms of lncRNAs and the roles lncRNAs play in brain development. Additionally, our method could be generalized to study and characterize lncRNAs conserved in other species and tissue types.


Subjects
Brain/metabolism , Conserved Sequence/genetics , RNA, Long Noncoding/genetics , Animals , Gene Expression Profiling , Gene Ontology , Gene Regulatory Networks , Humans , Molecular Sequence Annotation , Open Reading Frames/genetics , Protein Isoforms/genetics , Protein Isoforms/metabolism , Rats , Time Factors
4.
Hum Genomics ; 10 Suppl 2: 19, 2016 07 25.
Article in English | MEDLINE | ID: mdl-27461468

ABSTRACT

BACKGROUND: Green tea polyphenol epigallocatechin-3-gallate (EGCG) has been demonstrated to inhibit cancer in experimental studies through its antioxidant activity and modulations on cellular functions by binding specific proteins. By means of computational analysis and functional genomic approaches, we previously identified a set of protein coding genes and microRNAs whose expressions were significantly modulated in response to the EGCG treatment in tobacco carcinogen-induced lung adenocarcinoma in A/J mice. However, to what degree these genes are involved in the cancer inhibition of EGCG remains unclear. RESULTS: In this study, we further employed statistical methods and literature research to analyze these data in combination with The Cancer Genome Atlas (TCGA) lung adenocarcinoma datasets for additional data mining. Under the assumption that, if a gene mediates EGCG's cancer inhibition, its expression level change caused by EGCG should be opposite to what occurred in the carcinogenesis, we identified Myb and Peg3 as the primary putative genes involved in the cancer inhibitory activity. Further analysis suggested that the regulation of Myb could be mediated through an EGCG-upregulated microRNA, miR-449c-5p. CONCLUSIONS: Although the actions of EGCG involve multiple targets/pathways, further analysis by mining the existing genomic datasets revealed that the upregulations of Myb and Peg3 are likely the key anti-cancer events of EGCG in vivo.


Subjects
Adenocarcinoma/drug therapy , Catechin/analogs & derivatives , Lung Neoplasms/drug therapy , MicroRNAs/genetics , Proto-Oncogene Proteins c-myb/genetics , Up-Regulation/drug effects , Adenocarcinoma/genetics , Animals , Antioxidants/pharmacology , Catechin/pharmacology , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic/drug effects , Gene Regulatory Networks , Kruppel-Like Transcription Factors/genetics , Lung/drug effects , Lung/metabolism , Lung/pathology , Lung Neoplasms/genetics , Mice, Inbred Strains
5.
Hum Genomics ; 10 Suppl 2: 21, 2016 07 25.
Article in English | MEDLINE | ID: mdl-27461004

ABSTRACT

BACKGROUND: Chronic inflammation has been widely considered to be the major risk factor for coronary heart disease (CHD). The goal of our study was to explore the possible association with CHD of inflammation-related single nucleotide polymorphisms (SNPs) involved in cytosine-phosphate-guanine (CpG) dinucleotides. A total of 784 CHD patients and 739 non-CHD controls were recruited from Zhejiang Province, China. Using the Sequenom MassARRAY platform, we measured the genotypes of six inflammation-related CpG-SNPs (IL1B rs16944, IL1R2 rs2071008, PLA2G7 rs9395208, FAM5C rs12732361, CD40 rs1800686, and CD36 rs2065666). Allele and genotype frequencies were compared between CHD and non-CHD individuals using the CLUMP22 software with 10,000 Monte Carlo simulations. RESULTS: Allelic tests showed that PLA2G7 rs9395208 and CD40 rs1800686 were significantly associated with CHD. Moreover, IL1B rs16944, PLA2G7 rs9395208, and CD40 rs1800686 were shown to be associated with CHD under the dominant model. Further gender-based subgroup tests showed that one SNP (CD40 rs1800686) and two SNPs (FAM5C rs12732361 and CD36 rs2065666) were associated with CHD in females and males, respectively. The age-based subgroup tests indicated that PLA2G7 rs9395208, IL1B rs16944, and CD40 rs1800686 were associated with CHD among individuals younger than 55, younger than 65, and over 65, respectively. CONCLUSIONS: All six inflammation-related CpG-SNPs (rs16944, rs2071008, rs12732361, rs2065666, rs9395208, and rs1800686) were associated with CHD in the combined or subgroup tests, suggesting an important role of inflammation in the risk of CHD.


Subjects
Coronary Disease/genetics , CpG Islands/genetics , Genetic Predisposition to Disease/genetics , Inflammation/genetics , Polymorphism, Single Nucleotide , 1-Alkyl-2-acetylglycerophosphocholine Esterase/genetics , Aged , Asian People/genetics , CD36 Antigens/genetics , CD40 Antigens/genetics , China , Coronary Disease/ethnology , DNA-Binding Proteins/genetics , Female , Gene Frequency , Genetic Predisposition to Disease/ethnology , Genotype , Humans , Inflammation/ethnology , Interleukin-1beta/genetics , Linkage Disequilibrium , Male , Middle Aged , Odds Ratio , Receptors, Interleukin-1 Type II/genetics , Risk Factors
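
The abstract of record 5 compares allele frequencies between cases and controls using Monte Carlo simulation. The idea can be illustrated with a minimal label-permutation sketch. This is not the CLUMP22 algorithm; the function names, the 0/1 allele coding, and the example data are assumptions made only for illustration.

```python
import random


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0  # degenerate table: no evidence of a difference
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def monte_carlo_allele_test(case_alleles, control_alleles, n_sim=10_000, seed=0):
    """Empirical p-value for an allele-frequency difference between two groups.

    Alleles are coded 0/1 (hypothetical coding; two entries per genotyped person).
    Group labels are shuffled n_sim times and the chi-square statistic recomputed.
    """
    rng = random.Random(seed)
    n_case = len(case_alleles)
    pooled = list(case_alleles) + list(control_alleles)
    n_control = len(pooled) - n_case
    total_minor = sum(pooled)

    def stat(case_minor):
        control_minor = total_minor - case_minor
        return chi_square_2x2(case_minor, n_case - case_minor,
                              control_minor, n_control - control_minor)

    observed = stat(sum(case_alleles))
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(pooled)  # random reassignment of alleles to groups
        if stat(sum(pooled[:n_case])) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_sim + 1)
```

Calling `monte_carlo_allele_test(cases, controls)` with two lists of 0/1 allele codes returns the fraction of shuffled data sets whose statistic is at least as extreme as the observed one.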
6.
Hum Genomics ; 10 Suppl 2: 22, 2016 07 25.
Article in English | MEDLINE | ID: mdl-27461247

ABSTRACT

BACKGROUND: Snail is a typical transcription factor that can induce epithelial-mesenchymal transition (EMT) and cancer progression. There are some related reports about the clinical significance of snail protein expression in gastric cancer. However, the published results were not completely consistent. This study aimed to investigate snail expression and its clinical significance in gastric cancer. RESULTS: A systematic review of the PubMed, CNKI, Weipu, and Wanfang databases before March 2015 was conducted. We established inclusion criteria according to subjects, method of detection, and results evaluation of snail protein. Meta-analysis was conducted using RevMan4.2 software. Merged odds ratios (ORs) and 95 % confidence intervals (95 % CIs) were calculated. Forest plots and a funnel plot were used to assess the potential for publication bias. A total of 10 studies were included. The meta-analysis was conducted to evaluate the positive rate of snail protein expression. ORs and 95 % CIs for the different groups were as follows: (1) gastric cancer and para-carcinoma tissue [OR = 6.15, 95 % CI (4.70, 8.05)]; (2) gastric cancer and normal gastric tissue [OR = 17.00, 95 % CI (10.08, 28.67)]; (3) non-lymph node metastasis and lymph node metastasis [OR = 0.40, 95 % CI (0.18, 0.93)]; (4) poorly differentiated cancer, highly differentiated cancer, and moderately differentiated cancer [OR = 3.34, 95 % CI (2.22, 5.03)]; (5) clinical stage TI + TII and stage TIII + TIV [OR = 0.38, 95 % CI (0.23, 0.60)]; (6) superficial muscularis and deep muscularis [OR = 0.18, 95 % CI (0.11, 0.31)]. CONCLUSIONS: Our results indicated that the increase of snail protein expression may play an important role in the carcinogenesis, progression, and metastasis of gastric cancer. This result might provide guidance for the diagnosis, therapy, and prognosis of gastric cancer.


Subjects
Gastric Mucosa/metabolism , Gene Expression Regulation, Neoplastic , Snail Family Transcription Factors/genetics , Stomach Neoplasms/genetics , Gene Regulatory Networks , Humans , Lymphatic Metastasis , Neoplasm Invasiveness , Neoplasm Staging , Odds Ratio , Prognosis , Signal Transduction/genetics , Snail Family Transcription Factors/metabolism , Stomach/pathology , Stomach Neoplasms/diagnosis , Stomach Neoplasms/metabolism
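
The abstract of record 6 reports odds ratios with 95 % confidence intervals. As a rough illustration of the arithmetic, the sketch below computes an OR with a log-scale (Woolf) interval for one 2x2 table and an inverse-variance fixed-effect pooled OR across tables. The abstract does not say which pooling model was used in RevMan, so the pooling choice here is an assumption, and the function names and table layout are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and Wald-type CI for a 2x2 table.

    a, b = positive / negative counts in group 1; c, d = the same in group 2.
    All four cells must be non-zero for this simple formula.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


def pooled_odds_ratio(tables, z=1.96):
    """Inverse-variance fixed-effect pooled OR over several (a, b, c, d) tables."""
    weights, log_ors = [], []
    for a, b, c, d in tables:
        variance = 1 / a + 1 / b + 1 / c + 1 / d
        weights.append(1 / variance)  # more precise studies weigh more
        log_ors.append(math.log((a * d) / (b * c)))
    w_sum = sum(weights)
    pooled = sum(w * l for w, l in zip(weights, log_ors)) / w_sum
    se = math.sqrt(1 / w_sum)
    return math.exp(pooled), math.exp(pooled - z * se), math.exp(pooled + z * se)
```

An interval that excludes 1, as in all six comparisons listed in the abstract, indicates a difference in positive rate between the two groups at the 5 % level.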
7.
BMC Genomics ; 15 Suppl 11: S3, 2014.
Article in English | MEDLINE | ID: mdl-25559244

ABSTRACT

BACKGROUND: Epigallocatechin-3-gallate (EGCG) has been demonstrated to inhibit cancer in experimental studies through its antioxidant activity and modulations on cellular functions by binding specific proteins. We demonstrated previously that EGCG upregulates the expression of microRNA (i.e. miR-210) by binding HIF-1α, resulting in reduced cell proliferation and anchorage-independent growth. However, the binding affinities of EGCG to HIF-1α and many other targets are higher than the EGCG plasma peak level in experimental animals administered a high dose of EGCG, raising the question of whether the microRNA regulation by HIF-1α is involved in the anti-cancer activity of EGCG in vivo. RESULTS: We employed functional genomic approaches to elucidate the role of microRNA in the EGCG inhibition of tobacco carcinogen-induced lung tumors in A/J mice. By analysing the microRNA profiles, we found modest changes in the expression levels of 21 microRNAs. By correlating these 21 microRNAs with the mRNA expression profiles using computational methods, we identified 26 potential target genes of the 21 microRNAs. Further exploration using pathway analysis revealed that the pathways most impacted by EGCG treatment are the regulatory networks associated with AKT, NF-κB, MAP kinases, and cell cycle, and that the identified miRNA targets are involved in the networks of AKT, MAP kinases and cell cycle regulation. CONCLUSIONS: These results demonstrate that miRNA-mediated regulation is actively involved in the major aspects of the anti-cancer activity of EGCG in vivo.


Subjects
Anticarcinogenic Agents/pharmacology , Camellia sinensis/chemistry , Catechin/analogs & derivatives , Gene Expression Regulation/drug effects , Lung Neoplasms/drug therapy , MicroRNAs/metabolism , Polyphenols/pharmacology , Animals , Carcinogens , Catechin/pharmacology , Cell Cycle Proteins/metabolism , Female , Lung Neoplasms/chemically induced , Lung Neoplasms/pathology , Mice , Mitogen-Activated Protein Kinases/metabolism , NF-kappa B/metabolism , Nitrosamines , Proto-Oncogene Proteins c-akt/metabolism
8.
BMC Genomics ; 15 Suppl 11: I1, 2014.
Article in English | MEDLINE | ID: mdl-25558922

ABSTRACT

Synergistically integrating multi-layer genomic data at systems level not only can lead to deeper insights into the molecular mechanisms related to disease initiation and progression, but also can guide pathway-based biomarker and drug target identification. With the advent of high-throughput next-generation sequencing technologies, sequencing both DNA and RNA has generated multi-layer genomic data that can provide DNA polymorphism, non-coding RNA, messenger RNA, gene expression, isoform and alternative splicing information. Systems biology, on the other hand, studies complex biological systems, particularly through the systematic study of complex molecular interactions within specific cells or organisms. Genomics and molecular systems biology can be merged into the study of genomic profiles and implicated biological functions at cellular or organism level. The prospectively emerging field can be referred to as systems genomics or genomic systems biology. The Mid-South Bioinformatics Centre (MBC) and Joint Bioinformatics Ph.D. Program of University of Arkansas at Little Rock and University of Arkansas for Medical Sciences are particularly interested in promoting education and research advancement in this prospectively emerging field. Based on past investigations and research outcomes, MBC is further utilizing differential gene and isoform/exon expression from RNA-seq and co-regulation from ChIP-seq specific for different phenotypes, in combination with protein-protein interactions and protein-DNA interactions, to construct high-level gene networks for an integrative genome-phenome investigation at systems biology level.


Subjects
Genetic Research , Genomics , Systems Biology
9.
BMC Genomics ; 15 Suppl 11: S1, 2014.
Article in English | MEDLINE | ID: mdl-25559034

ABSTRACT

BACKGROUND: RDX is a well-known pollutant that induces neurotoxicity. MicroRNA (miRNA) and messenger RNA (mRNA) profiles are useful tools for toxicogenomics studies. It is worthwhile to integrate miRNA and mRNA expression data to understand RDX-induced neurotoxicity. RESULTS: Rats were treated with or without RDX for 48 h. Both miRNA and mRNA profiling were conducted using brain tissues. Nine miRNAs were significantly regulated by RDX. Of these, 6 and 3 miRNAs were up- and down-regulated, respectively. The putative target genes of RDX-regulated miRNAs were highly enriched in nervous system function genes and pathways. Fifteen differentially expressed genes altered by RDX in the mRNA profiles were putative targets of the regulated miRNAs. The induction of miR-71, miR-27ab, miR-98, and miR-135a expression by RDX could reduce the expression of the genes POLE4, C5ORF13, SULF1 and ROCK2, and eventually induce neurotoxicity. Over-expression of miR-27ab, or reduction of the expression of unknown miRNAs by RDX, could up-regulate HMGCR expression and contribute to neurotoxicity. RDX-regulated immune and inflammation response miRNAs and genes could contribute to RDX-induced neurotoxicity and other toxicities, as well as to the animal defense response to RDX exposure. CONCLUSIONS: Our results demonstrate that integrating miRNA and mRNA profiles is valuable to identify novel biomarkers and molecular mechanisms for RDX-induced neurological disorder and neurotoxicity.


Subjects
Brain/drug effects , Environmental Pollutants/toxicity , Gene Expression Profiling , MicroRNAs/metabolism , RNA, Messenger/metabolism , Triazines/toxicity , Animals , Biomarkers/metabolism , Brain/metabolism , Computational Biology , Female , Inflammation/genetics , Inflammation/metabolism , Neurotoxicity Syndromes/genetics , Neurotoxicity Syndromes/metabolism , Rats, Sprague-Dawley , Signal Transduction
10.
Sci Rep ; 14(1): 3946, 2024 02 16.
Article in English | MEDLINE | ID: mdl-38365936

ABSTRACT

The advent of single-cell RNA sequencing (scRNA-seq) technology has revolutionized our ability to explore cellular diversity and unravel the complexities of intricate diseases. However, due to the inherently low signal-to-noise ratio and the presence of an excessive number of missing values, scRNA-seq data analysis encounters unique challenges. Here, we present cnnImpute, a novel convolutional neural network (CNN) based method designed to address the issue of missing data in scRNA-seq. Our approach starts by estimating missing probabilities, followed by constructing a CNN-based model to recover expression values with a high likelihood of being missing. Through comprehensive evaluations, cnnImpute demonstrates its effectiveness in accurately imputing missing values while preserving the integrity of cell clusters in scRNA-seq data analysis. It achieved superior performance in various benchmarking experiments. cnnImpute offers an accurate and scalable method for recovering missing values, providing a useful resource for scRNA-seq data analysis.


Subjects
Gene Expression Profiling , Single-Cell Analysis , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Exome Sequencing , Probability , Cluster Analysis , RNA
11.
PeerJ ; 9: e10549, 2021.
Article in English | MEDLINE | ID: mdl-33665002

ABSTRACT

Alzheimer's disease (AD) is a progressive neurodegenerative disorder, accounting for nearly 60% of all dementia cases. The occurrence of the disease has been increasing rapidly in recent years. Presently about 46.8 million individuals suffer from AD worldwide. The current absence of effective treatment to reverse or stop AD progression highlights the importance of disease prevention and early diagnosis. Brain structural Magnetic Resonance Imaging (MRI) has been widely used for AD detection as it can display morphometric differences and cerebral structural changes. In this study, we built three machine learning-based MRI data classifiers to predict AD and infer the brain regions that contribute to disease development and progression. We then systematically compared the three distinct classifiers, which were constructed based on Support Vector Machine (SVM), 3D Very Deep Convolutional Network (VGGNet) and 3D Deep Residual Network (ResNet), respectively. To improve the performance of the deep learning classifiers, we applied a transfer learning strategy. The weights of a pre-trained model were transferred and adopted as the initial weights of our models. Transferring the learned features significantly reduced training time and increased network efficiency. The classification accuracy for distinguishing AD subjects from elderly control subjects was 90%, 95%, and 95% for the SVM, VGGNet and ResNet classifiers, respectively. Gradient-weighted Class Activation Mapping (Grad-CAM) was employed to show discriminative regions that contributed most to the AD classification by utilizing the learned spatial information of the 3D-VGGNet and 3D-ResNet models. The resulting maps consistently highlighted several disease-associated brain regions, particularly the cerebellum, which is a relatively neglected brain region in the present AD study. Overall, our comparisons suggested that the ResNet model provided the best classification performance as well as more accurate localization of disease-associated regions in the brain compared to the other two approaches.

12.
Genes (Basel) ; 12(12), 2021 11 27.
Article in English | MEDLINE | ID: mdl-34946850

ABSTRACT

Autism spectrum disorder (ASD) is a neurodevelopmental disorder that impedes patients' cognition, social, speech and communication skills. ASD is highly heterogeneous with a variety of etiologies and clinical manifestations. The prevalence rate of ASD has increased steadily in recent years. Presently, molecular mechanisms underlying ASD occurrence and development remain to be elucidated. Here, we integrated multi-layer genomics data to investigate the transcriptome and pathway dysregulations in ASD development. The RNA sequencing (RNA-seq) expression profiles of induced pluripotent stem cells (iPSCs), neural progenitor cells (NPCs) and neuron cells from ASD and normal samples were compared in our study. We found that substantially more genes were differentially expressed in the NPCs than the iPSCs. Consistently, gene set variation analysis revealed that the activity of the known ASD pathways in NPCs and neural cells was significantly different from the iPSCs, suggesting that ASD occurred at the early stage of neural system development. We further constructed comprehensive brain- and neural-specific regulatory networks by incorporating transcription factor (TF) and gene interactions with long non-coding RNA (lncRNA) and protein interactions. We then overlaid the transcriptomes of different cell types on the regulatory networks to infer the regulatory cascades. The variations of the regulatory cascades between ASD and normal samples uncovered a set of novel disease-associated genes and gene interactions, particularly highlighting the functional roles of ELF3 and the interaction between STAT1 and lncRNA ELF3-AS1 in the disease development. These new findings extend our understanding of ASD and offer putative new therapeutic targets for further studies.


Subjects
Autism Spectrum Disorder/genetics , Gene Regulatory Networks/genetics , Neurons/pathology , Autism Spectrum Disorder/pathology , Gene Expression Profiling/methods , Gene Expression Regulation/genetics , Humans , Induced Pluripotent Stem Cells/pathology , Neural Stem Cells/pathology , Organogenesis/genetics , Sequence Analysis, RNA/methods , Transcription Factors/genetics , Transcriptome/genetics
13.
BMC Genomics ; 10 Suppl 1: S1, 2009 Jul 07.
Article in English | MEDLINE | ID: mdl-19594868

ABSTRACT

BACKGROUND: Protein-DNA interactions are involved in many biological processes essential for cellular function. To understand the molecular mechanism of protein-DNA recognition, it is necessary to identify the DNA-binding residues in DNA-binding proteins. However, structural data are available for only a few hundred protein-DNA complexes. With the rapid accumulation of sequence data, it becomes an important but challenging task to accurately predict DNA-binding residues directly from amino acid sequence data. RESULTS: A new machine learning approach has been developed in this study for predicting DNA-binding residues from amino acid sequence data. The approach used both the labeled data instances collected from the available structures of protein-DNA complexes and the abundant unlabeled data found in protein sequence databases. The evolutionary information contained in the unlabeled sequence data was represented as position-specific scoring matrices (PSSMs) and several new descriptors. The sequence-derived features were then used to train random forests (RFs), which could handle a large number of input variables and avoid model overfitting. The use of evolutionary information was found to significantly improve classifier performance. The RF classifier was further evaluated using a separate test dataset, and the predicted DNA-binding residues were examined in the context of three-dimensional structures. CONCLUSION: The results suggest that the RF-based approach gives rise to more accurate prediction of DNA-binding residues than previous studies. A new web server called BindN-RF (http://bioinfo.ggc.org/bindn-rf/) has thus been developed to make the RF classifier accessible to the biological research community.


Subjects
Artificial Intelligence , Computational Biology/methods , DNA-Binding Proteins/metabolism , Sequence Analysis, Protein/methods , Algorithms , Binding Sites , ROC Curve , Software
14.
BMC Genomics ; 10 Suppl 1: I1, 2009 Jul 07.
Article in English | MEDLINE | ID: mdl-19594867

ABSTRACT

The advent of high-throughput next generation sequencing technologies has fostered enormous potential applications of supercomputing techniques in genome sequencing, epi-genetics, metagenomics, personalized medicine, discovery of non-coding RNAs and protein-binding sites. To this end, the 2008 International Conference on Bioinformatics and Computational Biology (Biocomp) - 2008 World Congress on Computer Science, Computer Engineering and Applied Computing (Worldcomp) was designed to promote synergistic inter/multidisciplinary research and education in response to the current research trends and advances. The conference attracted more than two thousand scientists, medical doctors, engineers, professors and students, who gathered in Las Vegas, Nevada, USA during July 14-17, and was a great success. Supported by International Society of Intelligent Biological Medicine (ISIBM), International Journal of Computational Biology and Drug Design (IJCBDD), International Journal of Functional Informatics and Personalized Medicine (IJFIPM) and the leading research laboratories from Harvard, M.I.T., Purdue, UIUC, UCLA, Georgia Tech, UT Austin, U. of Minnesota, U. of Iowa etc, the conference received thousands of research papers. Each submitted paper was reviewed by at least three reviewers and accepted papers were required to satisfy reviewers' comments. Finally, the review board and the committee decided to select only 19 high-quality research papers for inclusion in this supplement to BMC Genomics based on the peer reviews only. The conference committee was very grateful for the Plenary Keynote Lectures given by: Dr. Brian D. Athey (University of Michigan Medical School), Dr. Vladimir N. Uversky (Indiana University School of Medicine), Dr. David A. Patterson (Member of United States National Academy of Sciences and National Academy of Engineering, University of California at Berkeley) and Anousheh Ansari (Prodea Systems, Space Ambassador). The conference's aim of promoting synergistic research and education was achieved.


Subjects
Computational Biology/methods , Computational Biology/trends , Congresses as Topic
15.
BMC Genomics ; 10 Suppl 1: S3, 2009 Jul 07.
Article in English | MEDLINE | ID: mdl-19594880

ABSTRACT

INTRODUCTION: In the classification of Mass Spectrometry (MS) proteomics data, peak detection, feature selection, and learning classifiers are critical to classification accuracy. To better understand which methods are more accurate when classifying data, some publicly available peak detection algorithms for Matrix-Assisted Laser Desorption Ionization Mass Spectrometry (MALDI-MS) data were recently compared; however, the issue of different feature selection methods and different classification models as they relate to classification performance has not been addressed. With the application of intelligent computing, much progress has been made in the development of feature selection methods and learning classifiers for the analysis of high-throughput biological data. The main objective of this paper is to compare the methods of feature selection and different learning classifiers when applied to MALDI-MS data and to provide a subsequent reference for the analysis of MS proteomics data. RESULTS: We compared a well-known method of feature selection, Support Vector Machine Recursive Feature Elimination (SVMRFE), and a recently developed method, Gradient based Leave-one-out Gene Selection (GLGS), that effectively performs microarray data analysis. We also compared several learning classifiers including K-Nearest Neighbor Classifier (KNNC), Naïve Bayes Classifier (NBC), Nearest Mean Scaled Classifier (NMSC), uncorrelated normal based quadratic Bayes Classifier recorded as UDC, Support Vector Machines (SVMs), and a distance metric learning for Large Margin Nearest Neighbor classifier (LMNN) based on Mahalanobis distance. To compare, we conducted a comprehensive experimental study using three types of MALDI-MS data. CONCLUSION: Regarding feature selection, SVMRFE outperformed GLGS in classification. As for the learning classifiers, when classification models derived from the best training were compared, SVMs performed the best with respect to the expected testing accuracy. However, the distance metric learning LMNN outperformed SVMs and other classifiers when evaluated on the best testing results. In such cases, the optimum classification model based on LMNN is worth investigating in future studies.


Subjects
Artificial Intelligence , Models, Statistical , Pattern Recognition, Automated/methods , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Algorithms , Computational Biology/methods , Oligonucleotide Array Sequence Analysis , Proteomics
16.
BMC Genomics ; 10 Suppl 1: S18, 2009 Jul 07.
Article in English | MEDLINE | ID: mdl-19594877

ABSTRACT

BACKGROUND: DNA repair genes provide an important contribution towards the surveillance and repair of DNA damage. These genes produce a large network of interacting proteins whose mRNA expression is likely to be regulated by similar regulatory factors. Full characterization of promoters of DNA repair genes and the similarities among them will more fully elucidate the regulatory networks that activate or inhibit their expression. To address this goal, the authors introduce a technique to find regulatory genomic signatures, which represents a specific application of the genomic signature methodology to classify DNA sequences as putative functional elements within a single organism. RESULTS: The effectiveness of the regulatory genomic signatures is demonstrated via analysis of promoter sequences for genes in DNA repair pathways of humans. The promoters are divided into two classes, the bidirectional promoters and the unidirectional promoters, and distinct genomic signatures are calculated for each class. The genomic signatures include statistically overrepresented words, word clusters, and co-occurring words. The robustness of this method is confirmed by the ability to identify sequences that exist as motifs in TRANSFAC and JASPAR databases, and in overlap with verified binding sites in this set of promoter regions. CONCLUSION: The word-based signatures are shown to be effective by finding occurrences of known regulatory sites. Moreover, the signatures of the bidirectional and unidirectional promoters of human DNA repair pathways are clearly distinct, exhibiting virtually no overlap. In addition to providing an effective characterization method for related DNA sequences, the signatures elucidate putative regulatory aspects of DNA repair pathways, which are notably under-characterized.


Subjects
Computational Biology/methods , DNA Repair , Genetic Promoter Regions , Base Composition , Cluster Analysis , Genetic Databases , Humans , Statistical Models
17.
Neural Process Lett ; 50(1): 103-119, 2019 Aug.
Article in English | MEDLINE | ID: mdl-35035261

ABSTRACT

Automatically describing the contents of an image using natural language has drawn much attention because it not only integrates computer vision and natural language processing but also has practical applications. Using an end-to-end approach, we propose a bidirectional semantic attention-based guiding of long short-term memory (Bag-LSTM) model for image captioning. The proposed model consciously refines image features from previously generated text. By fine-tuning the parameters of convolutional neural networks, Bag-LSTM obtains more text-related image features via feedback propagation than other models. As opposed to existing guidance-LSTM methods, which directly add image features into each unit of an LSTM block, our fine-tuned model dynamically leverages more text-conditional image features, acquired by the semantic attention mechanism, as guidance information. Moreover, we exploit bidirectional gLSTM as the caption generator, which is capable of learning long-term relations between visual features and semantic information by making use of both historical and future contextual information. In addition, variations of the Bag-LSTM model are proposed in an effort to sufficiently describe high-level visual-language interactions. Experiments on the Flickr8k and MSCOCO benchmark datasets demonstrate the effectiveness of the model compared with the baseline algorithms; for example, it scores 51.2% higher than BRNN on the CIDEr metric.

18.
J Ambient Intell Humaniz Comput ; 10(5): 2029-2040, 2019 May.
Article in English | MEDLINE | ID: mdl-31068980

ABSTRACT

With the massive volume and rapid growth of data, the study of feature spaces is of great importance. To avoid the complex training processes of deep learning models, which project the original feature space into low-dimensional ones, we propose a novel feature space learning (FSL) model. The main contributions of our approach are: (1) FSL can not only select useful features but also adaptively update feature values and span new feature spaces; (2) four FSL algorithms are proposed with the feature space updating procedure; (3) FSL can provide a better understanding of the data and learn descriptive and compact feature spaces without the demanding training required for deep architectures. Experimental results on benchmark data sets demonstrate that FSL-based algorithms performed better than classical unsupervised and semi-supervised learning algorithms, and even incremental semi-supervised algorithms. In addition, we show a visualization of the learned feature space results. With the carefully designed learning strategy, FSL dynamically disentangles explanatory factors, suppresses noise accumulation and semantic shift, and constructs easy-to-understand feature spaces.

19.
BMC Syst Biol ; 13(1): 13, 2019 Jan 22.
Article in English | MEDLINE | ID: mdl-30670065

ABSTRACT

It was highlighted that the original article [1] contained a typesetting error in the last name of Allon Canaan. This was incorrectly captured as Allon Canaann in the original article which has since been updated.

20.
BMC Bioinformatics ; 9 Suppl 6: S9, 2008 May 28.
Article in English | MEDLINE | ID: mdl-18541062

ABSTRACT

BACKGROUND: Orthologous genes with deep phylogenetic histories are likely to retain similar regulatory features. In this report we utilize orthology assignments for pairs of genes co-regulated by bidirectional promoters to map the ancestral history of the promoter regions. RESULTS: Our mapping of bidirectional promoters from humans to fish shows that many such promoters emerged after the divergence of chickens and fish. Furthermore, annotations of promoters in deep phylogenies enable detection of missing data or assembly problems present in higher vertebrates. The functional importance of bidirectional promoters is indicated by selective pressure to maintain the arrangement of genes regulated by the promoter over long evolutionary time spans. Characteristics unique to bidirectional promoters are further elucidated using a technique for unsupervised classification, known as ESPERR. CONCLUSION: Results of these analyses will aid in our understanding of the evolution of bidirectional promoters, including whether the regulation of two genes evolved as a consequence of their proximity or if function dictated their co-regulation.


Subjects
Biological Evolution , Chromosome Mapping/methods , Molecular Evolution , Genetic Promoter Regions/genetics , Sequence Alignment/methods , DNA Sequence Analysis/methods , Vertebrates/genetics , Algorithms , Animals , Base Sequence , Molecular Sequence Data , Nucleic Acid Sequence Homology