Results 1 - 20 of 41
1.
Eur J Hum Genet ; 32(4): 461-465, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38200084

ABSTRACT

From a network medicine perspective, a disease is the consequence of perturbations on the interactome. These perturbations tend to appear in a specific neighbourhood on the interactome, the disease module, and modules related to phenotypically similar diseases tend to be located in close-by regions. We present LanDis, a freely available web-based interactive tool (https://paccanarolab.org/landis) that allows domain experts, medical doctors and the larger scientific community to graphically navigate the interactome distances between the modules of over 44 million pairs of heritable diseases. The map-like interface provides detailed comparisons between pairs of diseases together with supporting evidence. Every disease in LanDis is linked to relevant entries in OMIM and UniProt, providing a starting point for in-depth analysis and an opportunity for novel insight into the aetiology of diseases as well as differential diagnosis.

2.
J Biomed Inform ; 139: 104295, 2023 03.
Article in English | MEDLINE | ID: mdl-36716983

ABSTRACT

Healthcare datasets obtained from Electronic Health Records have proven to be extremely useful for assessing associations between patients' predictors and outcomes of interest. However, these datasets often suffer from missing values in a high proportion of cases, whose removal may introduce severe bias. Several multiple imputation algorithms have been proposed to attempt to recover the missing information under an assumed missingness mechanism. Each algorithm presents strengths and weaknesses, and there is currently no consensus on which multiple imputation algorithm works best in a given scenario. Furthermore, the selection of each algorithm's parameters and data-related modeling choices are also both crucial and challenging. In this paper we propose a novel framework to numerically evaluate strategies for handling missing data in the context of statistical analysis, with a particular focus on multiple imputation techniques. We demonstrate the feasibility of our approach on a large cohort of type-2 diabetes patients provided by the National COVID Cohort Collaborative (N3C) Enclave, where we explored the influence of various patient characteristics on outcomes related to COVID-19. Our analysis included classic multiple imputation techniques as well as simple complete-case Inverse Probability Weighted models. Extensive experiments show that our approach can effectively highlight the most promising and performant missing-data handling strategy for our case study. Moreover, our methodology allowed a better understanding of the behavior of the different models and of how it changed as we modified their parameters. Our method is general and can be applied to different research fields and on datasets containing heterogeneous types.


Subjects
COVID-19 , Humans , Algorithms , Research Design , Bias , Probability
3.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35679533

ABSTRACT

Patient similarity networks (PSNs), where patients are represented as nodes and their similarities as weighted edges, are being increasingly used in clinical research. These networks provide an insightful summary of the relationships among patients and can be exploited by inductive or transductive learning algorithms for the prediction of patient outcome, phenotype and disease risk. PSNs can also be easily visualized, thus offering a natural way to inspect complex heterogeneous patient data and providing some level of explainability of the predictions obtained by machine learning algorithms. The advent of high-throughput technologies, enabling us to acquire high-dimensional views of the same patients (e.g. omics data, laboratory data, imaging data), calls for the development of data fusion techniques for PSNs in order to leverage this rich heterogeneous information. In this article, we review existing methods for integrating multiple biomedical data views to construct PSNs, together with the different patient similarity measures that have been proposed. We also review methods that have appeared in the machine learning literature but have not yet been applied to PSNs, thus providing a resource to navigate the vast machine learning literature existing on this topic. In particular, we focus on methods that could be used to integrate very heterogeneous datasets, including multi-omics data as well as data derived from clinical information and medical imaging.
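The network construction described above can be illustrated with a minimal sketch. It assumes cosine similarity as the patient similarity measure and simple averaging of edge weights as the fusion step; both are illustrative choices, not methods attributed to the review, and the names `build_psn` and `fuse_views` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_psn(profiles, threshold=0.0):
    """Build one patient similarity network from one data view.

    profiles maps a patient id to a feature vector. The result maps an
    ordered pair of patient ids to an edge weight; pairs whose similarity
    does not exceed the threshold get no edge.
    """
    ids = sorted(profiles)
    edges = {}
    for i, p in enumerate(ids):
        for q in ids[i + 1:]:
            weight = cosine_similarity(profiles[p], profiles[q])
            if weight > threshold:
                edges[(p, q)] = weight
    return edges


def fuse_views(networks):
    """Fuse several views by averaging edge weights (assumed strategy).

    An edge missing from a view contributes a weight of 0 for that view.
    """
    all_edges = set().union(*[set(n) for n in networks])
    return {
        edge: sum(n.get(edge, 0.0) for n in networks) / len(networks)
        for edge in all_edges
    }


# Hypothetical toy data: one "omics" view and one precomputed "clinical" view.
omics = {"p1": [1.0, 0.0], "p2": [1.0, 0.0], "p3": [0.0, 1.0]}
omics_net = build_psn(omics, threshold=0.5)
clinical_net = {("p1", "p3"): 0.5}
fused = fuse_views([omics_net, clinical_net])
```

Averaging is the simplest possible fusion rule; the methods surveyed in the article are considerably more elaborate.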


Subjects
Algorithms , Machine Learning
4.
Viruses ; 14(2)2022 02 11.
Article in English | MEDLINE | ID: mdl-35215969

ABSTRACT

Despite the development of specific therapies against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the continuous investigation of the mechanism of action of clinically approved drugs could provide new information on the druggable steps of virus-host interaction. For example, chloroquine (CQ)/hydroxychloroquine (HCQ) lacks in vitro activity against SARS-CoV-2 in TMPRSS2-expressing cells, such as the human pneumocyte cell line Calu-3, and likewise failed to show clinical benefit in the Solidarity and Recovery clinical trials. Another antimalarial drug, mefloquine, which is not a 4-aminoquinoline like CQ/HCQ, has emerged as a potential anti-SARS-CoV-2 antiviral in vitro and has also been previously repurposed for respiratory diseases. Here, we investigated the anti-SARS-CoV-2 mechanism of action of mefloquine in cells relevant for the physiopathology of COVID-19, such as Calu-3 cells (that recapitulate type II pneumocytes) and monocytes. Molecular pathways modulated by mefloquine were assessed by differential expression analysis, and confirmed by biological assays. A PBPK model was developed to assess mefloquine's optimal doses for achieving therapeutic concentrations. Mefloquine inhibited SARS-CoV-2 replication in Calu-3, with an EC50 of 1.2 µM and EC90 of 5.3 µM. It reduced SARS-CoV-2 RNA levels in monocytes and prevented virus-induced enhancement of IL-6 and TNF-α. Mefloquine reduced SARS-CoV-2 entry and synergized with Remdesivir. Mefloquine's pharmacological parameters are consistent with its plasma exposure in humans, and its predicted tissue-to-plasma coefficient suggests that mefloquine may accumulate in the lungs. Altogether, our data indicate that mefloquine's chemical structure could represent an orally available host-acting agent to inhibit virus entry.


Subjects
Alveolar Epithelial Cells/drug effects , Antiviral Agents/pharmacology , Chloroquine/pharmacology , Mefloquine/pharmacology , SARS-CoV-2/drug effects , Adenosine Monophosphate/analogs & derivatives , Adenosine Monophosphate/pharmacology , Alanine/analogs & derivatives , Alanine/pharmacology , Alveolar Epithelial Cells/virology , Cell Line , Drug Repositioning/methods , Humans , Serine Endopeptidases/genetics , Virus Internalization/drug effects , COVID-19 Drug Treatment
5.
Cell Rep Methods ; 2(12): 100358, 2022 12 19.
Article in English | MEDLINE | ID: mdl-36590692

ABSTRACT

Early and accurate detection of side effects is critical for the clinical success of drugs under development. Here, we aim to predict unknown side effects for drugs with a small number of side effects identified in randomized controlled clinical trials. Our machine learning framework, the geometric self-expressive model (GSEM), learns globally optimal self-representations for drugs and side effects from pharmacological graph networks. We show the usefulness of the GSEM on 505 therapeutically diverse drugs and 904 side effects from multiple human physiological systems. Here, we also show a data integration strategy that could be adopted to improve the ability of side effect prediction models to identify unknown side effects that might only appear after the drug enters the market.


Subjects
Computational Biology , Drug-Related Side Effects and Adverse Reactions , Humans , Drug-Related Side Effects and Adverse Reactions/diagnosis , Machine Learning , Randomized Controlled Trials as Topic
6.
Patterns (N Y) ; 3(1): 100396, 2022 Jan 14.
Article in English | MEDLINE | ID: mdl-34778851

ABSTRACT

We present two machine learning approaches for drug repurposing. While we have developed them for COVID-19, they are disease-agnostic. The two methodologies are complementary, targeting SARS-CoV-2 and host factors, respectively. Our first approach consists of a matrix factorization algorithm to rank broad-spectrum antivirals. Our second approach, based on network medicine, uses graph kernels to rank drugs according to the perturbation they induce on a subnetwork of the human interactome that is crucial for SARS-CoV-2 infection/replication. Our experiments show that our top predicted broad-spectrum antivirals include drugs indicated for compassionate use in COVID-19 patients; and that the ranking obtained by our kernel-based approach aligns with experimental data. Finally, we present the COVID-19 repositioning explorer (CoREx), an interactive online tool to explore the interplay between drugs and SARS-CoV-2 host proteins in the context of biological networks, protein function, drug clinical use, and Connectivity Map. CoREx is freely available at: https://paccanarolab.org/corex/.

7.
Cell Rep ; 37(3): 109839, 2021 10 19.
Article in English | MEDLINE | ID: mdl-34624208

ABSTRACT

MicroRNAs (miRNAs) are small non-coding RNAs involved in post-transcriptional gene regulation that have a major impact on many diseases and provide an exciting avenue toward antiviral therapeutics. From patient transcriptomic data, we determined that a circulating miRNA, miR-2392, is directly involved with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) machinery during host infection. Specifically, we show that miR-2392 is key in driving downstream suppression of mitochondrial gene expression, increasing inflammation, glycolysis, and hypoxia, as well as promoting many symptoms associated with coronavirus disease 2019 (COVID-19) infection. We demonstrate that miR-2392 is present in the blood and urine of patients positive for COVID-19 but is not present in patients negative for COVID-19. These findings indicate the potential for developing a minimally invasive COVID-19 detection method. Lastly, using in vitro human and in vivo hamster models, we design a miRNA-based antiviral therapeutic that targets miR-2392, significantly reduces SARS-CoV-2 viability in hamsters, and may potentially inhibit a COVID-19 disease state in humans.


Subjects
COVID-19/genetics , COVID-19/immunology , MicroRNAs/genetics , SARS-CoV-2/genetics , Adult , Aged , Aged, 80 and over , Animals , Antiviral Agents/pharmacology , Biomarkers/metabolism , Cricetinae , Female , Ferrets , Gene Expression Regulation , Glycolysis , Healthy Volunteers , Humans , Hypoxia , Inflammation , Male , Mice , Middle Aged , Proteomics/methods , ROC Curve , Rats , COVID-19 Drug Treatment
8.
bioRxiv ; 2021 Aug 18.
Article in English | MEDLINE | ID: mdl-33948587

ABSTRACT

MicroRNAs (miRNAs) are small non-coding RNAs involved in post-transcriptional gene regulation that have a major impact on many diseases and provide an exciting avenue towards antiviral therapeutics. From patient transcriptomic data, we have discovered a circulating miRNA, miR-2392, that is directly involved with SARS-CoV-2 machinery during host infection. Specifically, we show that miR-2392 is key in driving downstream suppression of mitochondrial gene expression, increasing inflammation, glycolysis, and hypoxia, as well as promoting many symptoms associated with COVID-19 infection. We demonstrate that miR-2392 is present in the blood and urine of COVID-19 positive patients, but not detected in COVID-19 negative patients. These findings indicate the potential for developing a novel, minimally invasive COVID-19 detection method. Lastly, using in vitro human and in vivo hamster models, we have developed a novel miRNA-based antiviral therapeutic that targets miR-2392, significantly reduces SARS-CoV-2 viability in hamsters and may potentially inhibit a COVID-19 disease state in humans.

10.
Nat Commun ; 11(1): 4575, 2020 09 11.
Article in English | MEDLINE | ID: mdl-32917868

ABSTRACT

A central issue in drug risk-benefit assessment is identifying frequencies of side effects in humans. Currently, frequencies are experimentally determined in randomised controlled clinical trials. We present a machine learning framework for computationally predicting frequencies of drug side effects. Our matrix decomposition algorithm learns latent signatures of drugs and side effects that are both reproducible and biologically interpretable. We show the usefulness of our approach on 759 structurally and therapeutically diverse drugs and 994 side effects from all human physiological systems. Our approach can be applied to any drug for which a small number of side effect frequencies have been identified, in order to predict the frequencies of further, yet unidentified, side effects. We show that our model is informative of the biology underlying drug activity: individual components of the drug signatures are related to the distinct anatomical categories of the drugs and to the specific drug routes of administration.
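The idea of learning non-negative latent signatures from a drug-by-side-effect frequency matrix can be sketched with a generic non-negative matrix factorization. This is a textbook multiplicative-update scheme shown for illustration only; it is not the authors' algorithm, and the toy matrix, the rank and the function name `nmf` are assumptions.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(row) for row in zip(*m)]


def frobenius_error(x, w, h):
    """Squared Frobenius norm of X - WH."""
    wh = matmul(w, h)
    return sum(
        (x[i][j] - wh[i][j]) ** 2 for i in range(len(x)) for j in range(len(x[0]))
    )


def nmf(x, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize a non-negative matrix X (drugs x side effects) as W @ H.

    Rows of W play the role of drug signatures and columns of H the role of
    side-effect signatures. Multiplicative updates keep both non-negative.
    """
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T X) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (X H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical toy matrix: rows are drugs, columns are side effects,
# entries are frequency classes (0 means not observed).
frequencies = [[4.0, 0.0, 3.0], [0.0, 2.0, 0.0], [4.0, 0.0, 3.0]]
drug_signatures, effect_signatures = nmf(frequencies, rank=2)
predicted = matmul(drug_signatures, effect_signatures)
```

In such a setup, entries of `predicted` at positions where the input holds a zero would be read as candidate frequencies for side effects not yet recorded; how the published method treats unobserved entries is not stated in the abstract.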


Subjects
Drug-Related Side Effects and Adverse Reactions , Machine Learning , Algorithms , Computational Biology/methods , Databases, Pharmaceutical , Humans , Pharmaceutical Preparations/administration & dosage , Probability
11.
BMC Bioinformatics ; 21(1): 222, 2020 May 29.
Article in English | MEDLINE | ID: mdl-32471347

ABSTRACT

BACKGROUND: Genome-wide ligation-based assays such as Hi-C provide us with an unprecedented opportunity to investigate the spatial organization of the genome. Results of a typical Hi-C experiment are often summarized in a chromosomal contact map, a matrix whose elements reflect the co-location frequencies of genomic loci. To elucidate the complex structural and functional interactions between those genomic loci, networks offer a natural and powerful framework. RESULTS: We propose a novel graph-theoretical framework, the Corrected Gene Proximity (CGP) map, to study the effect of the 3D spatial organization of genes in transcriptional regulation. The starting point of the CGP map is a weighted network, the gene proximity map, whose weights are based on the contact frequencies between genes extracted from genome-wide Hi-C data. We derive a null model for the network based on the signal contributed by the 1D genomic distance and use it to "correct" the gene proximity for cell-type-specific 3D arrangements. The CGP map, therefore, provides a network framework for the 3D structure of the genome on a global scale. On human cell lines, we show that the CGP map can detect and quantify gene co-regulation and co-localization more effectively than the map obtained by raw contact frequencies. Analyzing the expression pattern of metabolic pathways of two hematopoietic cell lines, we find that the relative positioning of the genes, as captured and quantified by the CGP, is highly correlated with their expression change. We further show that the CGP map can be used to form an inter-chromosomal proximity map that allows large-scale abnormalities, such as chromosomal translocations, to be identified. CONCLUSIONS: The Corrected Gene Proximity map is a map of the 3D structure of the genome on a global scale. It allows the simultaneous analysis of intra- and inter-chromosomal interactions and of gene co-regulation and co-localization more effectively than the map obtained by raw contact frequencies, thus revealing hidden associations between global spatial positioning and gene expression. The flexible graph-based formalism of the CGP map can be easily generalized to study any existing Hi-C datasets.
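The correction step, in which a null model driven by 1D genomic distance is used to adjust raw contact frequencies, can be sketched as an observed-over-expected ratio. This ratio is a common Hi-C normalization used here as a stand-in; the abstract does not specify the authors' null model, and the binning scheme, the gene names and the function names are assumptions.

```python
from collections import defaultdict


def expected_by_distance(contacts, bin_size):
    """Mean contact frequency per genomic-distance bin (a simple 1D null model).

    contacts is a list of (gene_a, gene_b, distance_bp, frequency) tuples.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _, _, distance, frequency in contacts:
        distance_bin = distance // bin_size
        totals[distance_bin] += frequency
        counts[distance_bin] += 1
    return {b: totals[b] / counts[b] for b in totals}


def corrected_proximity(contacts, bin_size):
    """Divide each observed frequency by the expectation for its distance bin.

    Values above 1 mark gene pairs that are in contact more often than their
    1D separation alone would suggest.
    """
    expected = expected_by_distance(contacts, bin_size)
    corrected = {}
    for gene_a, gene_b, distance, frequency in contacts:
        e = expected[distance // bin_size]
        corrected[(gene_a, gene_b)] = frequency / e if e > 0 else 0.0
    return corrected


# Hypothetical toy contacts: two nearby pairs and one distant pair.
contacts = [
    ("g1", "g2", 1000, 10.0),
    ("g3", "g4", 1500, 30.0),
    ("g1", "g4", 900000, 2.0),
]
weights = corrected_proximity(contacts, bin_size=10000)
```

The resulting dictionary can be read as the edge list of a weighted gene network, which is the form the abstract describes for the CGP map.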


Subjects
Chromosomes, Human , Gene Expression Regulation , Genome, Human , Cell Line , Genomics/methods , Humans , Metabolic Networks and Pathways/genetics
12.
Mob DNA ; 11: 7, 2020.
Article in English | MEDLINE | ID: mdl-32042315

ABSTRACT

BACKGROUND: Ligation-mediated PCR protocols have diverse uses including the identification of integration sites of insertional mutagens, integrating vectors and naturally occurring mobile genetic elements. For approaches that employ NGS sequencing, the relative abundance of integrations within a complex mixture is typically determined through the use of read counts or unique fragment lengths from a ligation of sheared DNA; however, these estimates may be skewed by PCR amplification biases and saturation of sequencing coverage. RESULTS: Here we describe a modification of our previous splinkerette-based ligation-mediated PCR using a novel Illumina-compatible adapter design that prevents amplification of non-target DNA and incorporates unique molecular identifiers. This design reduces the number of PCR cycles required and improves relative quantitation of integration abundance for saturating sequencing coverage. By inverting the forked adapter strands from a standard orientation, the integration-genome junction can be sequenced without affecting the sequence diversity required for cluster generation on the flow cell. Replicate libraries of murine leukemia virus-infected spleen samples yielded highly reproducible quantitation of clonal integrations as well as a deep coverage of subclonal integrations. A dilution series of DNAs bearing integrations of MuLV or piggyBac transposon shows linearity of the quantitation over a range of concentrations. CONCLUSIONS: Merging ligation and library generation steps can reduce total PCR amplification cycles without sacrificing coverage or fidelity. The protocol is robust enough for use in a 96-well format using an automated liquid handler and we include programs for use of a Beckman Biomek liquid handling workstation. We also include an informatics pipeline that maps reads, builds integration contigs and quantitates integration abundance using both fragment lengths and unique molecular identifiers. Suggestions for optimizing the protocol for other target DNA sequences are included. The reproducible distinction of clonal and subclonal integration sites from each other allows for analysis of populations of cells undergoing selection, such as those found in insertional mutagenesis screens.
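The three abundance estimates mentioned in this abstract (read counts, unique fragment lengths and unique molecular identifiers) can be contrasted in a short sketch. The input layout and the function name `quantify_integrations` are hypothetical and are not taken from the authors' pipeline.

```python
from collections import defaultdict


def quantify_integrations(reads):
    """Summarize abundance per integration site.

    reads is an iterable of (site, fragment_length, umi) tuples, one per
    mapped read. Read counts are inflated by PCR duplicates, whereas the
    numbers of distinct fragment lengths and distinct UMIs are not.
    """
    read_counts = defaultdict(int)
    fragment_lengths = defaultdict(set)
    umis = defaultdict(set)
    for site, fragment_length, umi in reads:
        read_counts[site] += 1
        fragment_lengths[site].add(fragment_length)
        umis[site].add(umi)
    return {
        site: {
            "reads": read_counts[site],
            "unique_fragments": len(fragment_lengths[site]),
            "unique_umis": len(umis[site]),
        }
        for site in read_counts
    }


# Hypothetical reads: the first site has three PCR duplicates of one molecule
# plus a second molecule that happens to share the same fragment length.
reads = [
    ("chr1:1000:+", 150, "AACG"),
    ("chr1:1000:+", 150, "AACG"),
    ("chr1:1000:+", 150, "AACG"),
    ("chr1:1000:+", 150, "TTGA"),
    ("chr2:5000:-", 210, "CCAT"),
]
summary = quantify_integrations(reads)
```

In this toy input the first site has four reads, one fragment length and two UMIs, which shows why fragment lengths can saturate while UMIs still separate distinct molecules.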

13.
Sci Rep ; 10(1): 3612, 2020 02 27.
Article in English | MEDLINE | ID: mdl-32107391

ABSTRACT

Methods for phenotype and outcome prediction are largely based on inductive supervised models that use selected biomarkers to make predictions, without explicitly considering the functional relationships between individuals. We introduce a novel network-based approach named Patient-Net (P-Net) in which biomolecular profiles of patients are modeled in a graph-structured space that represents gene expression relationships between patients. Then a kernel-based semi-supervised transductive algorithm is applied to the graph to explore the overall topology of the graph and to predict the phenotype/clinical outcome of patients. Experimental tests involving several publicly available datasets of patients afflicted with pancreatic, breast, colon and colorectal cancer show that our proposed method is competitive with state-of-the-art supervised and semi-supervised predictive systems. Importantly, P-Net also provides interpretable models that can be easily visualized to gain clues about the relationships between patients, and to formulate hypotheses about their stratification.


Subjects
Breast Neoplasms/diagnosis , Colorectal Neoplasms/diagnosis , Gene Regulatory Networks , Neural Networks, Computer , Pancreatic Neoplasms/diagnosis , Algorithms , Artificial Intelligence , Breast Neoplasms/epidemiology , Colorectal Neoplasms/epidemiology , Computational Biology/methods , Datasets as Topic , Female , Humans , Individuality , Male , Pancreatic Neoplasms/epidemiology , Phenotype , Prognosis , Transcriptome , Treatment Outcome
14.
PLoS Comput Biol ; 15(7): e1007078, 2019 07.
Article in English | MEDLINE | ID: mdl-31276496

ABSTRACT

Network medicine approaches have been largely successful at increasing our knowledge of molecularly characterized diseases. Given a set of disease genes associated with a disease, neighbourhood-based methods and random walkers exploit the interactome allowing the prediction of further genes for that disease. In general, however, diseases with no known molecular basis constitute a challenge. Here we present a novel network approach to prioritize gene-disease associations that is able to also predict genes for diseases with no known molecular basis. Our method, which we have called Cardigan (ChARting DIsease Gene AssociatioNs), uses semi-supervised learning and exploits a measure of similarity between disease phenotypes. We evaluated its performance at predicting genes for both molecularly characterized and uncharacterized diseases in OMIM, using both weighted and binary interactomes, and compared it with state-of-the-art methods. Our tests, which use datasets collected at different points in time to replicate the dynamics of the disease gene discovery process, prove that Cardigan is able to accurately predict disease genes for molecularly uncharacterized diseases. Additionally, standard leave-one-out cross validation tests show how our approach outperforms state-of-the-art methods at predicting genes for molecularly characterized diseases by 14%-65%. Cardigan can also be used for disease module prediction, where it outperforms state-of-the-art methods by 87%-299%.
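The random-walk family of methods that this abstract uses as a point of comparison can be sketched as a random walk with restart from known disease genes. This illustrates the baseline idea only and is not Cardigan itself; the toy interactome, the restart probability and the function name are assumptions.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, iterations=100):
    """Score every node by its steady-state visiting probability.

    adjacency maps each node to a list of its neighbours (every node must
    appear as a key). seeds is the set of known disease genes where the
    walker restarts. Nodes without neighbours simply lose the mass they hold.
    """
    nodes = sorted(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(iterations):
        # With probability `restart` the walker jumps back to a seed gene.
        updated = {n: restart * start[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                continue
            # Otherwise it moves to a uniformly chosen neighbour.
            share = (1.0 - restart) * scores[n] / len(neighbours)
            for m in neighbours:
                updated[m] += share
        scores = updated
    return scores


# Hypothetical toy interactome: a path a - b - c - d with one known gene, "a".
interactome = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
scores = random_walk_with_restart(interactome, seeds={"a"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because the walker must start from at least one seed gene, this kind of baseline has nothing to propagate for a disease with no known molecular basis, which is the gap the abstract says Cardigan addresses through phenotype similarity.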


Subjects
Genetic Diseases, Inborn/genetics , Computational Biology/methods , Databases, Genetic , Genetic Diseases, Inborn/diagnosis , Humans , Phenotype
15.
Nat Commun ; 10(1): 1167, 2019 03 06.
Article in English | MEDLINE | ID: mdl-30842421

ABSTRACT

The original version of this Article contained an error in the hyperlink for the online repository http://mulvdb.org which was incorrectly given as http://mulv.lms.mrc.ac.uk. This has been corrected in both the PDF and HTML versions of the Article.

16.
Nat Commun ; 9(1): 2649, 2018 07 09.
Article in English | MEDLINE | ID: mdl-29985390

ABSTRACT

Determining whether recurrent but rare cancer mutations are bona fide driver mutations remains a bottleneck in cancer research. Here we present the most comprehensive analysis of murine leukemia virus-driven lymphomagenesis produced to date, sequencing 700,000 mutations from >500 malignancies collected at time points throughout tumor development. This scale of data allows novel statistical approaches for identifying selected mutations and yields a high-resolution, genome-wide map of the selective forces surrounding cancer gene loci. We also demonstrate negative selection of mutations that may be deleterious to tumor development, indicating novel avenues for therapy. Screening of two BCL2 transgenic models confirmed known drivers of human non-Hodgkin lymphoma, and implicated novel candidates including modifiers of immunosurveillance and MHC loci. Correlating mutations with genotypic and phenotypic features independently of local variance in mutation density also provides support for weakly evidenced cancer genes. An online resource http://mulv.lms.mrc.ac.uk allows customized queries of the entire dataset.


Subjects
Genetic Loci/genetics , Genetic Predisposition to Disease/genetics , Lymphoma/genetics , Mutation , Animals , Genetic Association Studies , Genome-Wide Association Study , HEK293 Cells , High-Throughput Nucleotide Sequencing , Humans , Leukemia Virus, Murine/genetics , Leukemia Virus, Murine/physiology , Mice, Inbred BALB C , Mice, Inbred C57BL , Mice, Transgenic , Mutagenesis, Insertional
17.
Genome Biol Evol ; 9(11): 3059-3072, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-29087523

ABSTRACT

Life-history transitions require major reprogramming at the behavioral and physiological level. Mating and reproductive maturation are known to trigger changes in gene transcription in reproductive tissues in a wide range of organisms, but we understand little about the molecular consequences of a failure to mate or become reproductively mature, and it is not clear to what extent these processes trigger neural as well as physiological changes. In this study, we examined the molecular processes underpinning the behavioral changes that accompany the major life-history transitions in a key pollinator, the bumblebee Bombus terrestris. We compared neuro-transcription in queens that succeeded or failed in switching from virgin and immature states, to mated and reproductively mature states. Both successes and failures were associated with distinct molecular profiles, illustrating how development during adulthood triggers distinct molecular profiles within a single caste of a eusocial insect. Failures in both mating and reproductive maturation were explained by a general up-regulation of brain gene transcription. We identified 21 genes that were highly connected in a gene coexpression network analysis: nine genes are involved in neural processes and four are regulators of gene expression. This suggests that negotiating life-history transitions involves significant neural processing and reprogramming, and not just changes in physiology. These findings provide novel insights into basic life-history transitions of an insect. Failure to mate or to become reproductively mature is an overlooked component of variation in natural systems, despite its prevalence in many sexually reproducing organisms, and deserves deeper investigation in the future.


Subjects
Bees/genetics , Gene Expression Regulation , Reproduction/genetics , Sexual Behavior, Animal , Animals , Bees/physiology , Brain/metabolism , Brain/physiology , Computational Biology , Databases, Genetic , Female , Gene Expression Profiling , Gene Regulatory Networks/genetics , Genetic Markers , Insect Proteins/genetics , Insect Proteins/metabolism , Insect Proteins/physiology , RNA/genetics , Reproduction/physiology
18.
Hum Mutat ; 37(5): 447-56, 2016 May.
Article in English | MEDLINE | ID: mdl-26841357

ABSTRACT

A new algorithm and Web server, mutation3D (http://mutation3d.org), proposes driver genes in cancer by identifying clusters of amino acid substitutions within tertiary protein structures. We demonstrate the feasibility of using a 3D clustering approach to implicate proteins in cancer based on explorations of single proteins using the mutation3D Web interface. On a large scale, we show that clustering with mutation3D is able to separate functional from nonfunctional mutations by analyzing a combination of 8,869 known inherited disease mutations and 2,004 SNPs overlaid together upon the same sets of crystal structures and homology models. Further, we present a systematic analysis of whole-genome and whole-exome cancer datasets to demonstrate that mutation3D identifies many known cancer genes as well as previously underexplored target genes. The mutation3D Web interface allows users to analyze their own mutation data in a variety of popular formats and provides seamless access to explore mutation clusters derived from over 975,000 somatic mutations reported by 6,811 cancer sequencing studies. The mutation3D Web interface is freely available with all major browsers supported.


Subjects
Amino Acid Substitution , Neoplasms/genetics , Proteome/genetics , Web Browser , Algorithms , Cluster Analysis , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Protein Structure, Tertiary , Proteome/chemistry
19.
Sci Rep ; 5: 17658, 2015 Dec 03.
Article in English | MEDLINE | ID: mdl-26631976

ABSTRACT

We introduce a MeSH-based method that accurately quantifies similarity between heritable diseases at the molecular level. This method effectively brings together the existing information about diseases that is scattered across the vast corpus of biomedical literature. We prove that sets of MeSH terms provide a highly descriptive representation of heritable diseases and that the structure of MeSH provides a natural way of combining individual MeSH vocabularies. We show that our measure can be used effectively in the prediction of candidate disease genes. We developed a web application to query more than 28.5 million relationships between 7,574 hereditary diseases (96% of OMIM) based on our similarity measure.


Subjects
Genetic Diseases, Inborn , Medical Subject Headings , Data Mining/methods , Genes , Genetic Diseases, Inborn/genetics , Humans , Internet
20.
J R Soc Interface ; 11(99)2014 Oct 06.
Article in English | MEDLINE | ID: mdl-25142519

ABSTRACT

We aimed to test the proposal that progressive combinations of multiple promoter elements acting in concert may be responsible for the full range of phases observed in plant circadian output genes. In order to allow reliable selection of informative phase groupings of genes for our purpose, intrinsic cyclic patterns of expression were identified using a novel, non-biased method for the identification of circadian genes. Our non-biased approach identified two dominant, inherent orthogonal circadian trends underlying publicly available microarray data from plants maintained under constant conditions. Furthermore, these trends were highly conserved across several plant species. Four phase-specific modules of circadian genes were generated by projection onto these trends and, in order to identify potential combinatorial promoter elements that might classify genes into these groups, we used a Random Forest pipeline which merged data from multiple decision trees to look for the presence of element combinations. We identified a number of regulatory motifs which aggregated into coherent clusters capable of predicting the inclusion of genes within each phase module with very high fidelity and these motif combinations changed in a consistent, progressive manner from one phase module group to the next, providing strong support for our hypothesis.


Subjects
Circadian Clocks/genetics , Gene Expression Regulation, Plant/physiology , Gene Regulatory Networks/genetics , Genes, Plant/genetics , Plant Physiological Phenomena , Promoter Regions, Genetic/genetics