1.
J Transl Med ; 22(1): 383, 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38659028

ABSTRACT

BACKGROUND: Loss of AZGP1 expression is a biomarker associated with progression to castration resistance, development of metastasis, and poor disease-specific survival in prostate cancer. However, high expression of AZGP1 in prostate cancer cells has been reported to increase proliferation and invasion. The exact role of AZGP1 in prostate cancer progression remains elusive. METHOD: AZGP1 knockout and overexpressing prostate cancer cells were generated using a lentiviral system. The effects of AZGP1 under- or over-expression in prostate cancer cells were evaluated by in vitro cell proliferation, migration, and invasion assays. Heterozygous AZGP1+/- mice were obtained from the European Mouse Mutant Archive (EMMA), and prostate tissues from homozygous knockout male mice were collected at 2, 6 and 10 months for histological analysis. In vivo xenografts generated from AZGP1 under- or over-expressing prostate cancer cells were used to determine the role of AZGP1 in prostate cancer tumor growth, and subsequent proteomics analysis was conducted to elucidate the mechanisms of AZGP1 action in prostate cancer progression. AZGP1 expression and microvessel density were measured in human prostate cancer samples on a tissue microarray of 215 independent patient samples. RESULT: Neither the knockout nor overexpression of AZGP1 exhibited significant effects on prostate cancer cell proliferation, clonal growth, migration, or invasion in vitro. The prostates of AZGP1-/- mice initially appeared to have grossly normal morphology; however, we observed fibrosis in the periglandular stroma and higher blood vessel density in the mouse prostate by 6 months. In PC3 and DU145 mouse xenografts, over-expression of AZGP1 did not affect tumor growth. Instead, these tumors displayed decreased microvessel density compared to xenografts derived from PC3 and DU145 control cells, suggesting that AZGP1 functions to inhibit angiogenesis in prostate cancer. Proteomics profiling further indicated that, compared to control xenografts, AZGP1 overexpressing PC3 xenografts are enriched with angiogenesis pathway proteins, including YWHAZ, EPHA2, SERPINE1, PDCD6, MMP9, GPX1, HSPB1, COL18A1, RNH1, and ANXA1. In vitro functional studies show that AZGP1 inhibits human umbilical vein endothelial cell proliferation, migration, tubular formation and branching. Additionally, tumor microarray analysis shows that AZGP1 expression is negatively correlated with blood vessel density in human prostate cancer tissues. CONCLUSION: AZGP1 is a negative regulator of angiogenesis, such that loss of AZGP1 promotes angiogenesis in prostate cancer. AZGP1 likely exerts heterotypical effects on cells in the tumor microenvironment, such as stromal and endothelial cells. This study sheds light on the anti-angiogenic characteristics of AZGP1 in the prostate and provides a rationale to target AZGP1 to inhibit prostate cancer progression.


Subjects
Cell Movement , Cell Proliferation , Neovascularization, Pathologic , Prostatic Neoplasms , Male , Animals , Prostatic Neoplasms/pathology , Prostatic Neoplasms/genetics , Prostatic Neoplasms/metabolism , Humans , Neovascularization, Pathologic/genetics , Neovascularization, Pathologic/pathology , Cell Line, Tumor , Mice, Knockout , Glycoproteins/metabolism , Neoplasm Invasiveness , Mice , Gene Expression Regulation, Neoplastic , Angiogenesis , Zn-Alpha-2-Glycoprotein
2.
BMC Bioinformatics ; 22(1): 35, 2021 Jan 30.
Article in English | MEDLINE | ID: mdl-33516170

ABSTRACT

BACKGROUND: Assigning chromatin states genome-wide (e.g. promoters and enhancers) is commonly performed to improve functional interpretation of these states. However, computational methods to assign chromatin state suffer from the following drawbacks: they typically require data from multiple assays, which may not be practically feasible to obtain, and they depend on peak calling algorithms, which require careful parameterization and often exclude the majority of the genome. To address these drawbacks, we propose a novel learning technique built upon the Self-Organizing Map (SOM), Self-Organizing Map with Variable Neighborhoods (SOM-VN), to learn a set of representative shapes from a single, genome-wide, chromatin accessibility dataset to associate with a chromatin state assignment in which a particular regulatory element (RE) is prevalent. These shapes can then be used to assign chromatin state using our workflow. RESULTS: We validate the performance of the SOM-VN workflow on 14 different samples of varying quality, namely one assay each of A549 and GM12878 cell lines and two each of H1 and HeLa cell lines, primary B-cells, and brain, heart, and stomach tissue. We show that SOM-VN learns shapes that are (1) non-random, (2) associated with known chromatin states, (3) generalizable across sets of chromosomes, and (4) associated with magnitude and multimodality. We compare the accuracy of SOM-VN chromatin states against the Clustering Aggregation Tool (CAGT), an unsupervised method that learns chromatin accessibility signal shapes but does not associate these shapes with REs, and we show that overall precision and recall are increased when learning shapes using SOM-VN as compared to CAGT. We further compare enhancer state assignments from SOM-VN in signals above a set threshold to enhancer state assignments from Predicting Enhancers from ATAC-seq Data (PEAS), a deep learning method that assigns enhancer chromatin states to peaks. We show that the precision-recall area under the curve for the assignment of enhancer states is comparable to PEAS. CONCLUSIONS: Our work shows that the SOM-VN workflow can learn relationships between REs and chromatin accessibility signal shape, which is an important step toward the goal of assigning and comparing enhancer state across multiple experiments and phenotypic states.


Subjects
Chromatin , Enhancer Elements, Genetic , Promoter Regions, Genetic , Adult , Algorithms , Child, Preschool , Chromatin/genetics , HeLa Cells , Humans , Young Adult
3.
Bioinformatics ; 35(10): 1653-1659, 2019 May 15.
Article in English | MEDLINE | ID: mdl-30329022

ABSTRACT

MOTIVATION: Technologies that generate high-throughput omics data are flourishing, creating enormous, publicly available repositories of multi-omics data. As many data repositories continue to grow, there is an urgent need for computational methods that can leverage these data to create comprehensive clusters of patients with a given disease. RESULTS: Our proposed approach creates a patient-to-patient similarity graph for each data type as an intermediate representation of each omics data type and merges the graphs through subspace analysis on a Grassmann manifold. We hypothesize that this approach generates more informative clusters by preserving the complementary information from each level of omics data. We applied our approach to The Cancer Genome Atlas (TCGA) breast cancer dataset and show that by integrating gene expression, microRNA and DNA methylation data, our proposed method can produce clinically useful subtypes of breast cancer. We then investigate the molecular characteristics underlying these subtypes. We discover a highly expressed cluster of genes on chromosome 19p13 that strongly correlates with survival in TCGA breast cancer patients and validate these results in three additional breast cancer datasets. We also compare our approach with previous integrative clustering approaches and obtain comparable or superior results. AVAILABILITY AND IMPLEMENTATION: https://github.com/michaelsharpnack/GrassmannCluster. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Breast Neoplasms , Cluster Analysis , DNA Methylation , Genome , Humans
4.
BMC Bioinformatics ; 20(Suppl 24): 669, 2019 Dec 20.
Article in English | MEDLINE | ID: mdl-31861998

ABSTRACT

BACKGROUND: Proteomic measurements, which closely reflect phenotypes, provide insights into gene expression regulation and the mechanisms underlying altered phenotypes. Further, integration of data on proteome and transcriptome levels can validate gene signatures associated with a phenotype. However, proteomic data is not as abundant as genomic data, and it is thus beneficial to use genomic features to predict protein abundances when matching proteomic samples or measurements within samples are lacking. RESULTS: We evaluate and compare four data-driven models for prediction of proteomic data from mRNA measured in breast and ovarian cancers using the 2017 DREAM Proteogenomics Challenge data. Our results show that Bayesian network, random forests, LASSO, and fuzzy logic approaches can predict protein abundance levels with median ground truth-predicted correlation values between 0.2 and 0.5. However, the most accurately predicted proteins differ considerably between approaches. CONCLUSIONS: In addition to benchmarking the aforementioned machine learning approaches for predicting protein levels from transcript levels, we discuss challenges and potential solutions in state-of-the-art proteogenomic analyses.
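The evaluation metric named in this abstract, the median correlation between measured and predicted protein abundance, can be sketched in a few lines. This is an illustrative Python sketch and not the authors' code: the function names, the dictionary layout (protein name mapped to per-sample values) and the toy numbers are all assumptions, and Pearson correlation is assumed because the abstract does not say which correlation was used.

```python
from math import sqrt
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    mx, my = mean(x), mean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom


def median_correlation(truth, predicted):
    """Median per-protein correlation between measured and predicted values."""
    return median(pearson(truth[p], predicted[p]) for p in truth)


# Hypothetical toy data: three proteins measured in four samples.
truth = {"P1": [1, 2, 3, 4], "P2": [1, 2, 3, 4], "P3": [2, 1, 4, 3]}
predicted = {"P1": [2, 4, 6, 8], "P2": [4, 3, 2, 1], "P3": [1, 2, 3, 4]}
print(round(median_correlation(truth, predicted), 3))
```

Reporting the median rather than the mean keeps a handful of very well or very poorly predicted proteins from dominating the summary.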


Subjects
Proteogenomics , Bayes Theorem , Gene Expression Regulation , Humans , Proteome/analysis , RNA, Messenger/genetics , Transcriptome
5.
Bioinformatics ; 33(10): 1570-1571, 2017 May 15.
Article in English | MEDLINE | ID: mdl-28169395

ABSTRACT

SUMMARY: We developed annoPeak, a web application to annotate, visualize and compare predicted protein-binding regions derived from ChIP-seq/ChIP-exo-seq experiments using human and mouse cells. Users can upload peak regions from multiple experiments onto the annoPeak server to annotate them with biological context, identify associated target genes and categorize binding sites with respect to gene structure. Users can also compare multiple binding profiles intuitively with the help of visualization tools and tables provided by annoPeak. In general, annoPeak will help users identify patterns of genome wide transcription factor binding profiles, assess binding profiles in different biological contexts and generate new hypotheses. AVAILABILITY AND IMPLEMENTATION: The web service is freely accessible through URL: http://ccc-annopeak.osumc.edu/annoPeak . Source code is available at https://github.com/XingTang2014/annoPeak . CONTACT: gustavo.leone@osumc.edu or kun.huang@osumc.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Chromatin Immunoprecipitation/methods , DNA/metabolism , High-Throughput Nucleotide Sequencing/methods , Software , Transcription Factors/metabolism , Animals , Binding Sites , Humans , Mice , Promoter Regions, Genetic , Protein Binding , Sequence Analysis, DNA/methods
6.
Methods ; 115: 65-79, 2017 Feb 15.
Article in English | MEDLINE | ID: mdl-28242295

ABSTRACT

Advances in optical microscopy, biosensors and cell culturing technologies have transformed live cell imaging. Thanks to these advances, live cell imaging plays an increasingly important role in basic biology research as well as at all stages of drug development. Image analysis methods are needed to extract quantitative information from these vast and complex data sets. The aim of this review is to provide an overview of available image analysis methods for live cell imaging, in particular the required preprocessing, image segmentation, cell tracking and data visualisation methods. The potential opportunities provided by recent advances in machine learning, especially deep learning, and computer vision are discussed. This review includes an overview of the different available software packages and toolkits.


Subjects
Image Processing, Computer-Assisted/methods , Machine Learning , Microscopy/methods , Molecular Imaging/methods , Software , Animals , Biosensing Techniques/instrumentation , Biosensing Techniques/methods , Cell Culture Techniques , Cell Tracking/instrumentation , Cell Tracking/methods , Eukaryotic Cells/metabolism , Eukaryotic Cells/ultrastructure , Humans , Image Processing, Computer-Assisted/statistics & numerical data , Microscopy/instrumentation , Molecular Imaging/instrumentation , Signal-To-Noise Ratio
7.
PLoS Comput Biol ; 12(4): e1004892, 2016 Apr.
Article in English | MEDLINE | ID: mdl-27100869

ABSTRACT

Co-expression analysis has been employed to predict gene function, identify functional modules, and determine tumor subtypes. Previous co-expression analysis was mainly conducted at bulk tissue level. It is unclear whether co-expression analysis at the single-cell level will provide novel insights into transcriptional regulation. Here we developed a computational approach to compare glioblastoma expression profiles at the single-cell level with those obtained from bulk tumors. We found that the co-expressed genes observed in single cells and bulk tumors have little overlap and show distinct characteristics. The co-expressed genes identified in bulk tumors tend to have similar biological functions, and are enriched for intrachromosomal interactions with synchronized promoter activity. In contrast, single-cell co-expressed genes are enriched for known protein-protein interactions, and are regulated through interchromosomal interactions. Moreover, gene members of some protein complexes are co-expressed only at the bulk level, while those of other complexes are co-expressed at both single-cell and bulk levels. Finally, we identified a set of co-expressed genes that can predict the survival of glioblastoma patients. Our study highlights that comparative analyses of single-cell and bulk gene expression profiles enable us to identify functional modules that are regulated at different levels and hold great translational potential.


Subjects
Glioblastoma/genetics , Single-Cell Analysis/statistics & numerical data , Brain Neoplasms/genetics , Brain Neoplasms/metabolism , Brain Neoplasms/pathology , Computational Biology , Computer Simulation , Glioblastoma/metabolism , Glioblastoma/pathology , Humans , Male , Models, Genetic , Multigene Family , Prognosis , Prostatic Neoplasms/genetics , Prostatic Neoplasms/metabolism , Prostatic Neoplasms/pathology , Protein Interaction Maps/genetics , Transcriptome
8.
BMC Med Inform Decis Mak ; 17(Suppl 2): 65, 2017 Jul 05.
Article in English | MEDLINE | ID: mdl-28699545

ABSTRACT

BACKGROUND: We develop predictive models enabling clinicians to better understand and explore patient clinical data along with risk factors for pressure ulcers in intensive care unit patients from electronic health record data. Identifying accurate risk factors of pressure ulcers is essential to determining appropriate prevention strategies; in this work we examine medication, diagnosis, and traditional Braden pressure ulcer assessment scale measurements as patient features. In order to predict pressure ulcer incidence and better understand the structure of related risk factors, we construct Bayesian networks from patient features. Bayesian network nodes (features) and edges (conditional dependencies) are simplified with statistical network techniques. Upon reviewing a network visualization of our model, our clinician collaborators were able to identify strong relationships between risk factors widely recognized as associated with pressure ulcers. METHODS: We present a three-stage framework for predictive analysis of patient clinical data: 1) developing electronic health record feature extraction functions with assistance of clinicians, 2) simplifying features, and 3) building Bayesian network predictive models. We evaluate all combinations of Bayesian network models from different search algorithms, scoring functions, prior structure initializations, and sets of features. RESULTS: From the EHRs of 7,717 ICU patients, we construct Bayesian network predictive models from 86 medication, diagnosis, and Braden scale features. Our model not only identifies known and suspected high PU risk factors, but also substantially increases the sensitivity of the prediction (nearly three times higher than that of logistic regression models) without sacrificing the overall accuracy. We visualize a representative model with which our clinician collaborators identify strong relationships between risk factors widely recognized as associated with pressure ulcers. CONCLUSIONS: Given the strong adverse effect of pressure ulcers on patients and the high cost of treating pressure ulcers, our Bayesian network based model provides a novel framework for significantly improving the sensitivity of the prediction model. Thus, when the model is deployed in a clinical setting, the caregivers can suitably respond to conditions likely associated with pressure ulcer incidence.


Subjects
Bayes Theorem , Electronic Health Records/statistics & numerical data , Intensive Care Units/statistics & numerical data , Models, Statistical , Pressure Ulcer , Adolescent , Adult , Aged , Aged, 80 and over , Female , Humans , Male , Middle Aged , Pressure Ulcer/diagnosis , Pressure Ulcer/epidemiology , Pressure Ulcer/therapy , Risk Factors , Young Adult
9.
BMC Bioinformatics ; 17(Suppl 17): 534, 2016 Dec 23.
Article in English | MEDLINE | ID: mdl-28155643

ABSTRACT

BACKGROUND: Identification and analysis of recurrent combinatorial patterns of multiple chromatin modifications provide invaluable information for understanding epigenetic regulation. Furthermore, as more data becomes available, it is computationally expensive and unnecessary to study combinatorial patterns of all modifications. METHODS: A novel framework is proposed to investigate recurrent combinatorial patterns of a subset of quantitatively selected chromatin modifications. The framework is based on hierarchical clustering and selects subsets of chromatin modifications that form distinct recurrent patterns at regulatory regions. The identified recurrent combinatorial patterns can be further utilized to discover novel regulatory regions. The data consist of genome-wide maps of histone acetylations, methylations, and a histone variant in human skeletal muscle and B-lymphocyte cells, both derived from the ENCODE project. RESULTS: A case study conducted at promoter regions is presented: four out of twelve chromatin modifications were selected, eight different promoter states were identified and the identified patterns of active promoters were further utilized to discover novel promoter regions. Several previously un-annotated promoters were discovered, and further investigations confirmed their promoter functions. CONCLUSIONS: This framework is appropriately general and could lead to better understanding of epigenetic regulation by discovering previously unknown regulatory regions.


Subjects
Chromatin/metabolism , Computational Biology/methods , Epigenesis, Genetic , Genome, Human , Regulatory Sequences, Nucleic Acid , Acetylation , B-Lymphocytes/metabolism , Cluster Analysis , Histones/metabolism , Humans , Methylation , Muscle, Skeletal/metabolism , Organ Specificity
10.
Methods ; 73: 54-70, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25524419

ABSTRACT

Studies of the brain's transcriptome have become prominent in recent years, resulting in an accumulation of datasets with somewhat distinct attributes. These datasets, which are often analyzed only in isolation, are also often collected with divergent goals, which are reflected in their sampling properties. While many researchers have been interested in sampling gene expression in one or a few brain areas in a large number of subjects, recent efforts from the Allen Institute for Brain Science and others have focused instead on dense neuroanatomical sampling, necessarily limiting the number of individual donor brains studied. The purpose of the present work is to develop methods that draw on the complementary strengths of these two types of datasets for study of the human brain, and to characterize the anatomical specificity of gene expression profiles and gene co-expression networks derived from human brains using different specific technologies. The approach is applied using two publicly accessible datasets: (1) the high anatomical resolution Allen Human Brain Atlas (AHBA, Hawrylycz et al., 2012) and (2) a relatively large sample size, but comparatively coarse neuroanatomical dataset described previously by Gibbs et al. (2010). We found a relatively high degree of correspondence in differentially expressed genes and regional gene expression profiles across the two datasets. Gene co-expression networks defined in individual brain regions were less congruent, but also showed modest anatomical specificity. Using gene modules derived from the Gibbs dataset and from curated gene lists, we demonstrated varying degrees of anatomical specificity based on two classes of methods, one focused on network modularity and the other focused on enrichment of expression levels. Two approaches to assessing the statistical significance of a gene set's modularity in a given brain region were studied, which provide complementary information about the anatomical specificity of a gene network of interest. Overall, the present work demonstrates the feasibility of cross-dataset analysis of human brain microarray studies, and offers a new approach to annotating gene lists in a neuroanatomical context.


Subjects
Atlases as Topic , Brain/physiology , Databases, Genetic , Gene Expression Profiling/methods , Transcriptome/genetics , Brain/anatomy & histology , Databases, Genetic/statistics & numerical data , Gene Regulatory Networks/genetics , Humans , Statistics as Topic/methods
11.
BMC Bioinformatics ; 16 Suppl 11: S10, 2015.
Article in English | MEDLINE | ID: mdl-26330277

ABSTRACT

BACKGROUND: Histology images comprise one of the important sources of knowledge for phenotyping studies in systems biology. However, the annotation and analyses of histological data have remained a manual, subjective and relatively low-throughput process. RESULTS: We introduce Graph based Histology Image Explorer (GRAPHIE), a visual analytics tool to explore, annotate and discover potential relationships in histology image collections within a biologically relevant context. The design of GRAPHIE is guided by domain experts' requirements and well-known InfoVis mantras. By representing each image with informative features and then visualizing the image collection with a graph, GRAPHIE allows users to effectively explore the image collection. The features were designed to capture localized morphological properties in the given tissue specimen. More importantly, users can perform feature selection in an interactive way to improve the visualization of the image collection and the overall annotation process. Finally, the annotation allows for a better prospective examination of datasets, as demonstrated in the user study. Thus, our design of GRAPHIE allows users to navigate and explore large collections of histology image datasets. CONCLUSIONS: We demonstrated the usefulness of our visual analytics approach through two case studies. Both cases showed efficient annotation and analysis of histology image collections.


Subjects
Algorithms , Computational Biology/methods , Computer Graphics , Image Processing, Computer-Assisted/methods , Retina/cytology , Software , Zebrafish/growth & development , Animals , Computer Simulation , Databases, Factual , Information Storage and Retrieval , Models, Biological , User-Computer Interface
12.
Methods ; 67(3): 304-12, 2014 Jun 01.
Article in English | MEDLINE | ID: mdl-24657666

ABSTRACT

Breast cancers are highly heterogeneous with different subtypes that lead to different clinical outcomes including prognosis, response to treatment and chances of recurrence and metastasis. An important task in personalized medicine is to determine the subtype for a breast cancer patient in order to provide the most effective treatment. To achieve this goal, integrative genomics approaches have been developed recently with multiple modalities of large datasets ranging from genotypes to multiple levels of phenotypes. A major challenge in integrative genomics is how to effectively integrate multiple modalities of data to stratify breast cancer patients. Consensus clustering algorithms have often been adopted for this purpose. However, existing consensus clustering algorithms are not suitable for integrating clustering results obtained from a mixture of numerical data and categorical data. In this work, we present a mathematical formulation for integrative clustering of multiple-source data including both numerical and categorical data to resolve the above issue. Specifically, we formulate the problem as a novel consensus clustering method called Molecular Regularized Consensus Patient Stratification (MRCPS) based on an optimization process with regularization. Unlike the traditional consensus clustering methods, MRCPS can automatically and spontaneously cluster both numerical and categorical data with any option of similarity metrics. We apply this new method to the TCGA breast cancer datasets and evaluate it using both statistical criteria and clinical relevance for predicting prognosis. The results demonstrate the superiority of this method in terms of effectiveness of aggregation and differentiating patient outcomes. Our method, while motivated by breast cancer research, is nevertheless universal for integrative genomics studies.
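To make the idea of consensus clustering in this abstract concrete, the sketch below builds a co-association matrix, a common starting point for consensus methods: each entry is the fraction of input clusterings in which two patients share a cluster. This is a generic baseline written for illustration, not the MRCPS optimization described above; the function name and the toy labels are assumptions.

```python
def co_association(clusterings):
    """Fraction of clusterings in which each pair of patients shares a cluster.

    clusterings -- list of label lists, one label per patient. Labels only
    need to support equality, so cluster ids derived from numerical data and
    categorical labels (e.g. a receptor status) can be mixed.
    """
    n = len(clusterings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in clusterings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(clusterings) for c in row] for row in counts]


# Hypothetical example: four patients, three sources of grouping.
clusterings = [
    [0, 0, 1, 1],                  # e.g. clusters from expression data
    ["pos", "pos", "pos", "neg"],  # e.g. a categorical clinical label
    [1, 1, 0, 0],                  # e.g. clusters from methylation data
]
matrix = co_association(clusterings)
for row in matrix:
    print([round(v, 2) for v in row])
```

A final partition would then be obtained by clustering this matrix; that step is omitted here because the abstract does not specify how MRCPS performs it.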


Subjects
Breast Neoplasms/pathology , Cluster Analysis , Algorithms , Breast Neoplasms/genetics , Datasets as Topic , Female , Humans , Precision Medicine
13.
BMC Bioinformatics ; 15: 203, 2014 Jul 10.
Article in English | MEDLINE | ID: mdl-25000928

ABSTRACT

BACKGROUND: Cancers are highly heterogeneous with different subtypes. These subtypes often possess different genetic variants, present different pathological phenotypes, and most importantly, show various clinical outcomes such as varied prognosis and response to treatment and likelihood for recurrence and metastasis. Recently, integrative genomics (or panomics) approaches are often adopted with the goal of combining multiple types of omics data to identify integrative biomarkers for stratification of patients into groups with different clinical outcomes. RESULTS: In this paper we present a visual analytic system called Interactive Genomics Patient Stratification explorer (iGPSe) which significantly reduces the computing burden for biomedical researchers in the process of exploring complicated integrative genomics data. Our system integrates unsupervised clustering with graph and parallel sets visualization and allows direct comparison of clinical outcomes via survival analysis. Using a breast cancer dataset obtained from The Cancer Genome Atlas (TCGA) project, we are able to quickly explore different combinations of gene expression (mRNA) and microRNA features and identify potential combined markers for survival prediction. CONCLUSIONS: Visualization plays an important role in the process of stratifying a given patient population. Visual tools allowed for the selection of possible features across various datasets for the given patient population. We essentially made a case for visualization for a very important problem in translational informatics.
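Comparing clinical outcomes between patient groups via survival analysis is often done with Kaplan-Meier curves. The abstract does not name the estimator iGPSe uses, so the following is a minimal sketch under that assumption; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time.

    times  -- follow-up time per patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Patients still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        # Events observed exactly at time t (censored patients do not count).
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical follow-up data for one patient cluster (time in months).
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Running the function once per cluster gives one curve per group, which is the kind of side-by-side comparison the abstract describes; a formal test of the difference (for example a log-rank test) is not shown.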


Subjects
Breast Neoplasms/genetics , Genomics/methods , Software , Breast Neoplasms/mortality , Gene Expression Regulation, Neoplastic , Humans , MicroRNAs/genetics , Neoplasm Recurrence, Local/genetics , Prognosis , RNA, Messenger/genetics , Survival Analysis
15.
medRxiv ; 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38883716

ABSTRACT

Serum total immunoglobulin E levels (total IgE) capture the state of the immune system in relation to allergic sensitization. High levels are associated with airway obstruction and poor clinical outcomes in pediatric asthma. Inconsistent patient response to anti-IgE therapies motivates discovery of molecular mechanisms underlying serum IgE level differences in children with asthma. To uncover these mechanisms using complementary metabolomic and transcriptomic data, abundance levels of 529 named metabolites and expression levels of 22,772 genes were measured among children with asthma in the Childhood Asthma Management Program (CAMP, N=564) and the Genetic Epidemiology of Asthma in Costa Rica Study (GACRS, N=309) via the TOPMed initiative. Gene-metabolite associations dependent on IgE were identified within each cohort using multivariate linear models and were interpreted in a biochemical context using network topology, pathway and chemical enrichment, and representation within reactions. A total of 1,617 total IgE-dependent gene-metabolite associations from GACRS and 29,885 from CAMP met significance cutoffs. Of these, glycine and guanidinoacetic acid (GAA) were associated with the most genes in both cohorts, and the associations represented reactions central to glycine, serine, and threonine metabolism and arginine and proline metabolism. Pathway and chemical enrichment analysis further highlighted additional related pathways of interest. The results of this study suggest that GAA may modulate total IgE levels in two independent pediatric asthma cohorts with different characteristics, supporting the use of L-Arginine as a potential therapeutic for asthma exacerbation. Other potentially new targetable pathways are also uncovered.

16.
J Child Psychol Psychiatry ; 54(10): 1109-19, 2013 Oct.
Artigo em Inglês | MEDLINE | ID: mdl-23909413

RESUMO

BACKGROUND: Numerous studies have examined gene × environment interactions (G × E) in cognitive and behavioral domains. However, these studies have been limited in that they have not been able to directly assess differential patterns of gene expression in the human brain. Here, we assessed G × E interactions using two publically available datasets to assess if DNA variation is associated with post-mortem brain gene expression changes based on smoking behavior, a biobehavioral construct that is part of a complex system of genetic and environmental influences. METHODS: We conducted an expression quantitative trait locus (eQTL) study on two independent human brain gene expression datasets assessing G × E for selected psychiatric genes and smoking status. We employed linear regression to model the significance of the Gene × Smoking interaction term, followed by meta-analysis across datasets. RESULTS: Overall, we observed that the effect of DNA variation on gene expression is moderated by smoking status. Expression of 16 genes was significantly associated with single nucleotide polymorphisms that demonstrated G × E effects. The strongest finding (p = 1.9 × 10⁻¹¹) was neurexin 3-alpha (NRXN3), a synaptic cell-cell adhesion molecule involved in maintenance of neural connections (such as the maintenance of smoking behavior). Other significant G × E associations include four glutamate genes. CONCLUSIONS: This is one of the first studies to demonstrate G × E effects within the human brain. In particular, this study implicated NRXN3 in the maintenance of smoking. The effect of smoking on NRXN3 expression and downstream behavior is different based upon SNP genotype, indicating that DNA profiles based on SNPs could be useful in understanding the effects of smoking behaviors. 
These results suggest that better measurement of psychiatric conditions and of the environment in post-mortem brain studies may yield an important avenue for understanding the biological mechanisms of G × E interactions in psychiatry.
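The abstract above mentions pooling the Gene × Smoking interaction term across two datasets by meta-analysis but does not name the method. As a minimal sketch, assuming an inverse-variance fixed-effect model (an assumption, not necessarily what the authors used), the pooling step could look like the following. The function name `fixed_effect_meta` and the example numbers are hypothetical.

```python
import math

def fixed_effect_meta(estimates, std_errors):
    """Pool per-dataset coefficients with inverse-variance weights.

    Assumption: a fixed-effect model; the abstract does not state
    which meta-analysis method was applied.
    Returns (pooled estimate, pooled standard error, two-sided p-value).
    """
    weights = [1.0 / (se * se) for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled / pooled_se
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, p_value

# Hypothetical interaction coefficients and standard errors from two datasets.
pooled, pooled_se, p_value = fixed_effect_meta([0.4, 0.6], [0.1, 0.2])
print(pooled, pooled_se, p_value)
```

The dataset with the smaller standard error receives the larger weight, so the pooled estimate sits closer to its coefficient.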


Subjects
Frontal Lobe/metabolism , Gene Expression Regulation/genetics , Gene-Environment Interaction , Smoking/genetics , Smoking/metabolism , Adolescent , Adult , Frontal Lobe/pathology , Humans , Nerve Tissue Proteins/genetics , Neural Pathways/physiology , Smoking/psychology , Young Adult
17.
Bioinform Adv ; 3(1): vbad009, 2023.
Article in English | MEDLINE | ID: mdl-36922980

ABSTRACT

Motivation: IntLIM uncovers phenotype-dependent linear associations between two types of analytes (e.g. genes and metabolites) in a multi-omic dataset, which may reflect chemically or biologically relevant relationships. Results: The new IntLIM R package adds support for generalized data types, covariate correction, continuous phenotypic measurements, model validation and unit testing. IntLIM analysis uncovered biologically relevant gene-metabolite associations in two separate datasets, and the run time is improved over baseline R functions by multiple orders of magnitude. Availability and implementation: IntLIM is available as an R package with a detailed vignette (https://github.com/ncats/IntLIM) and as an R Shiny app (see Supplementary Figs S1-S6) (https://intlim.ncats.io/). Supplementary information: Supplementary data are available at Bioinformatics Advances online.
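One common way to express a phenotype-dependent linear association between two analytes is a linear model with an interaction term, where a nonzero interaction coefficient means the gene-metabolite slope changes with phenotype. The sketch below illustrates that idea in Python with NumPy on simulated data. It is not the IntLIM API, and the function name `interaction_fit` and the simulated values are hypothetical.

```python
import numpy as np

def interaction_fit(gene, metabolite, phenotype):
    """Least-squares fit of metabolite ~ gene + phenotype + gene:phenotype.

    Returns coefficients ordered as
    [intercept, gene, phenotype, gene-by-phenotype interaction].
    """
    gene = np.asarray(gene, dtype=float)
    metabolite = np.asarray(metabolite, dtype=float)
    phenotype = np.asarray(phenotype, dtype=float)
    design = np.column_stack(
        [np.ones_like(gene), gene, phenotype, gene * phenotype]
    )
    coef, *_ = np.linalg.lstsq(design, metabolite, rcond=None)
    return coef

# Simulated data: the gene-metabolite slope is +1 when phenotype == 0
# and -1 when phenotype == 1, so the true interaction coefficient is -2.
rng = np.random.default_rng(0)
n = 200
gene = rng.normal(size=n)
phenotype = rng.integers(0, 2, size=n).astype(float)
metabolite = gene - 2.0 * gene * phenotype + 0.1 * rng.normal(size=n)

coef = interaction_fit(gene, metabolite, phenotype)
print(coef)
```

In a real analysis the interaction coefficient would be tested for significance for every analyte pair and corrected for multiple testing; this sketch only shows the model form.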

18.
Front Bioinform ; 3: 1296667, 2023.
Article in English | MEDLINE | ID: mdl-38323039

ABSTRACT

Introduction: Prostate cancer is a highly heterogeneous disease, presenting varying levels of aggressiveness and response to treatment. Angiogenesis is one of the hallmarks of cancer, providing oxygen and nutrient supply to tumors. Microvessel density has previously been correlated with higher Gleason score and poor prognosis. Manual segmentation of blood vessels (BVs) in microscopy images is challenging, time consuming and may be prone to inter-rater variability. In this study, an automated pipeline is presented for BV detection and distribution analysis in multiplexed prostate cancer images. Methods: A deep learning model was trained to segment BVs by combining CD31, CD34 and collagen IV images. In addition, the trained model was used to analyze the size and distribution patterns of BVs in relation to disease progression in a cohort of prostate cancer patients (N = 215). Results: The model was capable of accurately detecting and segmenting BVs, as compared to ground truth annotations provided by two reviewers. The precision (P), recall (R) and Dice similarity coefficient (DSC) were equal to 0.93 (SD 0.04), 0.97 (SD 0.02) and 0.71 (SD 0.07) with respect to reviewer 1, and 0.95 (SD 0.05), 0.94 (SD 0.07) and 0.70 (SD 0.08) with respect to reviewer 2, respectively. BV count was significantly associated with 5-year recurrence (adjusted p = 0.0042), while both BV count and BV area were significantly associated with Gleason grade (adjusted p = 0.032 and 0.003, respectively). Discussion: The proposed methodology is anticipated to streamline and standardize BV analysis, offering additional insights into the biology of prostate cancer, with broad applicability to other cancers.
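For readers unfamiliar with the three metrics reported above, the sketch below computes precision, recall and the Dice similarity coefficient from a predicted and a reference binary mask. The abstract does not say whether precision and recall were counted per pixel or per detected vessel; this sketch assumes per-pixel counting for all three, and the function name `segmentation_scores` and the toy masks are hypothetical.

```python
import numpy as np

def segmentation_scores(pred, truth):
    """Pixel-wise precision, recall and Dice for two binary masks.

    Assumption: every pixel is one unit of agreement or disagreement.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.logical_and(pred, truth).sum())    # vessel in both masks
    fp = int(np.logical_and(pred, ~truth).sum())   # predicted only
    fn = int(np.logical_and(~pred, truth).sum())   # reference only
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = 2 * tp + fp + fn
    dice = 2 * tp / denom if denom else 1.0        # two empty masks agree
    return precision, recall, dice

# Toy 3x3 masks: two shared pixels, one false positive, one false negative.
truth = np.array([[0, 1, 1],
                  [0, 1, 0],
                  [0, 0, 0]])
pred = np.array([[0, 1, 1],
                 [0, 0, 0],
                 [1, 0, 0]])
precision, recall, dice = segmentation_scores(pred, truth)
print(precision, recall, dice)
```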

19.
BMC Bioinformatics ; 13 Suppl 2: S2, 2012 Mar 13.
Article in English | MEDLINE | ID: mdl-22536865

ABSTRACT

BACKGROUND: RNA polymerase II (PolII) is essential in gene transcription, and ChIP-seq experiments have been used to study PolII binding patterns over the entire genome. However, since PolII-enriched regions in the genome can be very long, existing peak-finding algorithms for ChIP-seq data are not adequate for identifying such long regions. METHODS: Here we propose an enriched region detection method for ChIP-seq data that identifies long enriched regions by combining a signal denoising algorithm with a false discovery rate (FDR) approach. The binned ChIP-seq data for PolII are first denoised using a non-local means (NL-means) algorithm. Then, an FDR approach is developed to determine the threshold for marking enriched regions in the binned histogram. RESULTS: We first test our method using a public PolII ChIP-seq dataset and compare our results with published results obtained using the published algorithm HPeak. Our results show a high consistency with the published results (80-100%). Then, we apply our proposed method to PolII ChIP-seq data generated in our own study on the effects of hormone on the breast cancer cell line MCF7. The results demonstrate that our method can effectively identify long enriched regions in ChIP-seq datasets. Specifically, in MCF7 control samples we identified 5,911 segments with a length of at least 4 kbp (maximum 233,000 bp), and in E2-treated MCF7 samples we identified 6,200 such segments (maximum 325,000 bp). CONCLUSIONS: We demonstrated the effectiveness of this method in studying binding patterns of PolII in cancer cells, which enables further in-depth analysis of transcription regulation and epigenetics. Our method complements existing peak detection algorithms for ChIP-seq experiments.
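The two-stage idea described above (denoise the binned signal, then mark bins above a threshold as enriched) can be sketched as follows. This is an illustrative one-dimensional non-local means filter, not the authors' implementation. The patch size, search window and smoothing parameter `h` are assumed values, and the threshold is passed in directly because the abstract does not describe how the FDR procedure derives it. All function names are hypothetical.

```python
import numpy as np

def nl_means_1d(signal, patch_radius=2, search_radius=10, h=1.0):
    """Denoise a 1-D binned signal with a basic non-local means filter.

    Each bin becomes a weighted average of nearby bins, with weights that
    decay with the squared distance between the patches around the bins.
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    padded = np.pad(signal, patch_radius, mode="reflect")
    width = 2 * patch_radius + 1
    # patches[i] is the window centred on bin i.
    patches = np.stack([padded[i:i + width] for i in range(n)])
    out = np.empty(n)
    for i in range(n):
        lo = max(0, i - search_radius)
        hi = min(n, i + search_radius + 1)
        dist2 = np.mean((patches[lo:hi] - patches[i]) ** 2, axis=1)
        weights = np.exp(-dist2 / (h * h))
        out[i] = np.sum(weights * signal[lo:hi]) / np.sum(weights)
    return out

def enriched_runs(values, threshold, min_bins):
    """Return (start, end) bin ranges, end exclusive, where values stay
    above the threshold for at least min_bins consecutive bins."""
    above = np.asarray(values) > threshold
    runs, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_bins:
                runs.append((start, i))
            start = None
    if start is not None and len(above) - start >= min_bins:
        runs.append((start, len(above)))
    return runs

# Simulated binned counts with one long enriched stretch (bins 100-179).
rng = np.random.default_rng(1)
counts = rng.poisson(2.0, size=300).astype(float)
counts[100:180] = rng.poisson(12.0, size=80)
denoised = nl_means_1d(counts, h=3.0)
print(enriched_runs(denoised, threshold=7.0, min_bins=40))
```

Requiring a minimum run length is what lets this kind of approach report long enriched regions rather than isolated peaks.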


Subjects
Algorithms , Chromatin Immunoprecipitation/methods , High-Throughput Nucleotide Sequencing , RNA Polymerase II/analysis , Sequence Analysis, DNA , Breast Neoplasms/genetics , Cell Line, Tumor , Female , Genome, Human , Humans , Male , Prostatic Neoplasms/genetics , Signal Processing, Computer-Assisted
20.
ACS Omega ; 7(11): 9465-9483, 2022 Mar 22.
Article in English | MEDLINE | ID: mdl-35350358

ABSTRACT

Recent advances in molecular machine learning, especially deep neural networks such as graph neural networks (GNNs), for predicting structure-activity relationships (SAR) have shown tremendous potential in computer-aided drug discovery. However, the applicability of such deep neural networks is limited by the requirement of large amounts of training data. To cope with limited training data for a target task, transfer learning for SAR modeling has recently been adopted to leverage information from data of related tasks. In this work, in contrast to popular parameter-based transfer learning such as pretraining, we develop novel deep transfer learning methods, TAc and TAc-fc, to leverage source domain data and transfer useful information to the target domain. TAc learns to generate effective molecular features that can generalize well from one domain to another and increase the classification performance in the target domain. Additionally, TAc-fc extends TAc by incorporating novel components to selectively learn feature-wise and compound-wise transferability. We used the bioassay screening data from PubChem and identified 120 pairs of bioassays such that the active compounds in each pair are more similar to each other than their inactive compounds are. Overall, TAc achieves the best performance with an average ROC-AUC of 0.801; it significantly improves the ROC-AUC of 83% of target tasks with an average task-wise performance improvement of 7.102%, compared to the best baseline dmpna. Our experiments clearly demonstrate that TAc achieves significant improvement over all baselines across a large number of target tasks. Furthermore, although TAc-fc achieves slightly worse ROC-AUC on average compared to TAc (0.798 vs 0.801), TAc-fc still achieves the best performance on more tasks in terms of PR-AUC and F1 compared to other methods.
In summary, TAc-fc is also found to be a strong model with competitive or even better performance than TAc on a notable number of target tasks.
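ROC-AUC, the headline metric above, equals the probability that a randomly chosen active compound is scored higher than a randomly chosen inactive one, with ties counted as half. The sketch below computes it directly from that definition and also shows a relative improvement calculation. Whether the reported 7.102% is a relative or absolute change is not stated in the abstract, so the relative form here is an assumption, and both function names and the example values are hypothetical.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (active, inactive) pairs ranked correctly.

    labels: 1 for active, 0 for inactive. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def relative_improvement(new, baseline):
    """Percentage change of `new` relative to `baseline` (assumed definition)."""
    return 100.0 * (new - baseline) / baseline

# Toy example: three actives and two inactives.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise form is quadratic in the number of compounds, which is fine for illustration; library implementations use a sort-based calculation instead.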
