Search | VHL Regional Portal

1.

Node-degree aware edge sampling mitigates inflated classification performance in biomedical random walk-based graph representation learning.

Cappelletti, Luca; Rekerle, Lauren; Fontana, Tommaso; Hansen, Peter; Casiraghi, Elena; Ravanmehr, Vida; Mungall, Christopher J; Yang, Jeremy J; Spranger, Leonard; Karlebach, Guy; Caufield, J Harry; Carmody, Leigh; Coleman, Ben; Oprea, Tudor I; Reese, Justin; Valentini, Giorgio; Robinson, Peter N.

Bioinform Adv ; 4(1): vbae036, 2024.

Article in English | MEDLINE | ID: mdl-38577542

ABSTRACT

Motivation: Graph representation learning is a family of related approaches that learn low-dimensional vector representations of nodes and other graph elements called embeddings. Embeddings approximate characteristics of the graph and can be used for a variety of machine-learning tasks such as novel edge prediction. For many biomedical applications, partial knowledge exists about positive edges that represent relationships between pairs of entities, but little to no knowledge is available about negative edges that represent the explicit lack of a relationship between two nodes. For this reason, classification procedures are forced to assume that the vast majority of unlabeled edges are negative. Existing approaches to sampling negative edges for training and evaluating classifiers do so by uniformly sampling pairs of nodes. Results: We show here that this sampling strategy typically leads to sets of positive and negative examples with imbalanced node degree distributions. Using a representative heterogeneous biomedical knowledge graph and random walk-based graph machine learning, we show that this strategy substantially impacts classification performance. If users of graph machine-learning models apply the models to prioritize examples that are drawn from approximately the same distribution as the positive examples, then performance of models as estimated in the validation phase may be artificially inflated. We present a degree-aware node sampling approach that mitigates this effect and is simple to implement. Availability and implementation: Our code and data are publicly available at https://github.com/monarch-initiative/negativeExampleSelection.
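The contrast between uniform and degree-aware negative sampling described in this abstract can be sketched as follows. This is an illustration only and not the code in the linked repository: the function names are hypothetical, and it assumes that "degree-aware" can be approximated by drawing each endpoint with probability proportional to its degree, so that negative examples resemble the endpoints of positive edges.

```python
import random
from collections import Counter


def node_degrees(edges):
    """Count how many positive edges touch each node."""
    return Counter(node for edge in edges for node in edge)


def sample_negatives(edges, n_samples, degree_aware, seed=0):
    """Sample distinct node pairs that are not positive edges.

    Assumes an undirected graph without self-loops. With degree_aware=False
    every node is equally likely; with degree_aware=True a node is drawn in
    proportion to its degree (an assumed stand-in for the paper's method).
    """
    rng = random.Random(seed)
    positive = {frozenset(edge) for edge in edges}
    degree = node_degrees(edges)
    nodes = sorted(degree)
    weights = [degree[n] if degree_aware else 1 for n in nodes]
    negatives = set()
    for _ in range(1000 * n_samples):  # bounded number of attempts
        if len(negatives) == n_samples:
            break
        u, v = rng.choices(nodes, weights=weights, k=2)
        pair = frozenset((u, v))
        if u != v and pair not in positive:
            negatives.add(pair)
    return sorted(tuple(sorted(pair)) for pair in negatives)


def mean_endpoint_degree(pairs, degree):
    """Average degree over all endpoints of the given node pairs."""
    return sum(degree[u] + degree[v] for u, v in pairs) / (2 * len(pairs))
```

On a graph with one hub and many low-degree nodes, comparing `mean_endpoint_degree` for the two settings shows the imbalance the abstract describes: uniformly sampled negatives are dominated by low-degree nodes, while positive edges are not.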

2.

Computing Minimal Boolean Models of Gene Regulatory Networks.

Karlebach, Guy; Robinson, Peter N.

J Comput Biol ; 31(2): 117-127, 2024 Feb.

Article in English | MEDLINE | ID: mdl-37889991

ABSTRACT

Models of gene regulatory networks (GRNs) capture the dynamics of the regulatory processes that occur within the cell as a means to understanding the variability observed in gene expression between different conditions. Arguably the simplest mathematical construct used for modeling is the Boolean network, which dictates a set of logical rules for transition between states described as Boolean vectors. Due to the complexity of gene regulation and the limitations of experimental technologies, in most cases knowledge about regulatory interactions and Boolean states is partial. In addition, the logical rules themselves are not known a priori. Our goal in this work is to create an algorithm that finds the network that fits the data optimally, and identify the network states that correspond to the noise-free data. We present a novel methodology for integrating experimental data and performing a search for the optimal consistent structure via optimization of a linear objective function under a set of linear constraints. In addition, we extend our methodology into a heuristic that alleviates the computational complexity of the problem for datasets that are generated by single-cell RNA-Sequencing (scRNA-Seq). We demonstrate the effectiveness of these tools using simulated data, and in addition a publicly available scRNA-Seq dataset and the GRN that is associated with it. Our methodology will enable researchers to obtain a better understanding of the dynamics of GRNs and their biological role.
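To make the notion of a Boolean network concrete, the sketch below defines a toy three-gene network, applies its logical rules to move from one Boolean state vector to the next, and counts how many observed values disagree with those rules. The genes, rules, and function names are hypothetical, and the mismatch count is only a simple stand-in for "fit to the data"; the paper's linear-programming search is not reproduced here.

```python
# Hypothetical three-gene network: each rule computes a gene's next value
# from the current state of the whole network.
RULES = {
    "A": lambda s: not s["C"],         # C represses A
    "B": lambda s: s["A"],             # A activates B
    "C": lambda s: s["A"] and s["B"],  # A and B jointly activate C
}


def step(state):
    """Synchronous update: every gene is updated from the same current state."""
    return {gene: bool(rule(state)) for gene, rule in RULES.items()}


def mismatches(trajectory):
    """Count observed values that disagree with the rule-based prediction
    made from the preceding observed state."""
    total = 0
    for current, observed_next in zip(trajectory, trajectory[1:]):
        predicted = step(current)
        total += sum(predicted[g] != bool(observed_next[g]) for g in RULES)
    return total
```

A trajectory that follows the rules exactly has zero mismatches; measurement noise shows up as a positive count, which is the kind of discrepancy an optimal fit would try to minimize.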

Subject(s)

Algorithms , Gene Regulatory Networks , Gene Expression Regulation

3.

Alternative splicing is coupled to gene expression in a subset of variably expressed genes.

Karlebach, Guy; Steinhaus, Robin; Danis, Daniel; Devoucoux, Maeva; Anczuków, Olga; Sheynkman, Gloria; Seelow, Dominik; Robinson, Peter N.

bioRxiv ; 2023 Oct 11.

Article in English | MEDLINE | ID: mdl-37398049

ABSTRACT

Numerous factors regulate alternative splicing of human genes at a co-transcriptional level. However, how alternative splicing depends on the regulation of gene expression is poorly understood. We leveraged data from the Genotype-Tissue Expression (GTEx) project to show a significant association of gene expression and splicing for 6874 (4.9%) of 141,043 exons in 1106 (13.3%) of 8314 genes with substantially variable expression in ten GTEx tissues. About half of these exons demonstrate higher inclusion with higher gene expression, and half demonstrate higher exclusion, with the observed direction of coupling being highly consistent across different tissues and in external datasets. The exons differ with respect to sequence characteristics, enriched sequence motifs, RNA polymerase II binding, and inferred transcription rate of downstream introns. The exons were enriched for hundreds of isoform-specific Gene Ontology annotations, suggesting that the coupling of expression and alternative splicing described here may provide an important gene regulatory mechanism that might be used in a variety of biological contexts. In particular, higher inclusion exons could play an important role during cell division.

4.

An expectation-maximization framework for comprehensive prediction of isoform-specific functions.

Karlebach, Guy; Carmody, Leigh; Sundaramurthi, Jagadish Chandrabose; Casiraghi, Elena; Hansen, Peter; Reese, Justin; Mungall, Christopher J; Valentini, Giorgio; Robinson, Peter N.

Bioinformatics ; 39(4), 2023 04 03.

Article in English | MEDLINE | ID: mdl-36929917

ABSTRACT

MOTIVATION: Advances in RNA sequencing technologies have achieved an unprecedented accuracy in the quantification of mRNA isoforms, but our knowledge of isoform-specific functions has lagged behind. There is a need to understand the functional consequences of differential splicing, which could be supported by the generation of accurate and comprehensive isoform-specific gene ontology annotations. RESULTS: We present isoform interpretation, a method that uses expectation-maximization to infer isoform-specific functions based on the relationship between sequence and functional isoform similarity. We predicted isoform-specific functional annotations for 85 617 isoforms of 17 900 protein-coding human genes spanning a range of 17 430 distinct gene ontology terms. Comparison with a gold-standard corpus of manually annotated human isoform functions showed that isoform interpretation significantly outperforms state-of-the-art competing methods. We provide experimental evidence that functionally related isoforms predicted by isoform interpretation show a higher degree of domain sharing and expression correlation than functionally related genes. We also show that isoform sequence similarity correlates better with inferred isoform function than with gene-level function. AVAILABILITY AND IMPLEMENTATION: Source code, documentation, and resource files are freely available under a GNU3 license at https://github.com/TheJacksonLaboratory/isopretEM and https://zenodo.org/record/7594321.

Subject(s)

Motivation , Software , Humans , Protein Isoforms/genetics , Alternative Splicing , Sequence Analysis, RNA

5.

Generalisable long COVID subtypes: findings from the NIH N3C and RECOVER programmes.

Reese, Justin T; Blau, Hannah; Casiraghi, Elena; Bergquist, Timothy; Loomba, Johanna J; Callahan, Tiffany J; Laraway, Bryan; Antonescu, Corneliu; Coleman, Ben; Gargano, Michael; Wilkins, Kenneth J; Cappelletti, Luca; Fontana, Tommaso; Ammar, Nariman; Antony, Blessy; Murali, T M; Caufield, J Harry; Karlebach, Guy; McMurry, Julie A; Williams, Andrew; Moffitt, Richard; Banerjee, Jineta; Solomonides, Anthony E; Davis, Hannah; Kostka, Kristin; Valentini, Giorgio; Sahner, David; Chute, Christopher G; Madlock-Brown, Charisse; Haendel, Melissa A; Robinson, Peter N.

EBioMedicine ; 87: 104413, 2023 Jan.

Article in English | MEDLINE | ID: mdl-36563487

ABSTRACT

BACKGROUND: Stratification of patients with post-acute sequelae of SARS-CoV-2 infection (PASC, or long COVID) would allow precision clinical management strategies. However, long COVID is incompletely understood and characterised by a wide range of manifestations that are difficult to analyse computationally. Additionally, the generalisability of machine learning classification of COVID-19 clinical outcomes has rarely been tested. METHODS: We present a method for computationally modelling PASC phenotype data based on electronic healthcare records (EHRs) and for assessing pairwise phenotypic similarity between patients using semantic similarity. Our approach defines a nonlinear similarity function that maps from a feature space of phenotypic abnormalities to a matrix of pairwise patient similarity that can be clustered using unsupervised machine learning. FINDINGS: We found six clusters of PASC patients, each with distinct profiles of phenotypic abnormalities, including clusters with distinct pulmonary, neuropsychiatric, and cardiovascular abnormalities, and a cluster associated with broad, severe manifestations and increased mortality. There was significant association of cluster membership with a range of pre-existing conditions and measures of severity during acute COVID-19. We assigned new patients from other healthcare centres to clusters by maximum semantic similarity to the original patients, and showed that the clusters were generalisable across different hospital systems. The increased mortality rate originally identified in one cluster was consistently observed in patients assigned to that cluster in other hospital systems. INTERPRETATION: Semantic phenotypic clustering provides a foundation for assigning patients to stratified subgroups for natural history or therapy studies on PASC. FUNDING: NIH (TR002306/OT2HL161847-01/OD011883/HG010860), U.S.D.O.E. (DE-AC02-05CH11231), Donald A. Roux Family Fund at Jackson Laboratory, Marsico Family at CU Anschutz.
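The two steps described in METHODS and FINDINGS, building a pairwise patient similarity matrix and assigning a new patient to an existing cluster by maximum similarity, can be sketched as below. Several assumptions are made for illustration: Jaccard overlap of phenotype term sets stands in for the ontology-based semantic similarity used in the study, a new patient inherits the cluster of the single most similar original patient, and all names and phenotype labels are hypothetical.

```python
def jaccard(terms_a, terms_b):
    """Set overlap in [0, 1]; an assumed stand-in for semantic similarity."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(patients):
    """Return patient ids and the symmetric pairwise similarity matrix."""
    ids = sorted(patients)
    matrix = [[jaccard(patients[i], patients[j]) for j in ids] for i in ids]
    return ids, matrix


def assign_cluster(new_terms, patients, cluster_of):
    """Give a new patient the cluster of the most similar original patient."""
    best = max(sorted(patients), key=lambda pid: jaccard(new_terms, patients[pid]))
    return cluster_of[best]
```

The similarity matrix is what an unsupervised method would cluster; `assign_cluster` then reuses the resulting labels for patients from other centres without re-clustering.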

Subject(s)

COVID-19 , Post-Acute COVID-19 Syndrome , Humans , Disease Progression , SARS-CoV-2

6.

Generalizable Long COVID Subtypes: Findings from the NIH N3C and RECOVER Programs.

Reese, Justin T; Blau, Hannah; Bergquist, Timothy; Loomba, Johanna J; Callahan, Tiffany; Laraway, Bryan; Antonescu, Corneliu; Casiraghi, Elena; Coleman, Ben; Gargano, Michael; Wilkins, Kenneth J; Cappelletti, Luca; Fontana, Tommaso; Ammar, Nariman; Antony, Blessy; Murali, T M; Karlebach, Guy; McMurry, Julie A; Williams, Andrew; Moffitt, Richard; Banerjee, Jineta; Solomonides, Anthony E; Davis, Hannah; Kostka, Kristin; Valentini, Giorgio; Sahner, David; Chute, Christopher G; Madlock-Brown, Charisse; Haendel, Melissa A; Robinson, Peter N.

medRxiv ; 2022 Jul 20.

Article in English | MEDLINE | ID: mdl-35665012

ABSTRACT

Accurate stratification of patients with post-acute sequelae of SARS-CoV-2 infection (PASC, or long COVID) would allow precision clinical management strategies. However, the natural history of long COVID is incompletely understood and characterized by an extremely wide range of manifestations that are difficult to analyze computationally. In addition, the generalizability of machine learning classification of COVID-19 clinical outcomes has rarely been tested. We present a method for computationally modeling PASC phenotype data based on electronic healthcare records (EHRs) and for assessing pairwise phenotypic similarity between patients using semantic similarity. Our approach defines a nonlinear similarity function that maps from a feature space of phenotypic abnormalities to a matrix of pairwise patient similarity that can be clustered using unsupervised machine learning procedures. Using k-means clustering of this similarity matrix, we found six distinct clusters of PASC patients, each with distinct profiles of phenotypic abnormalities. There was a significant association of cluster membership with a range of pre-existing conditions and with measures of severity during acute COVID-19. Two of the clusters were associated with severe manifestations and displayed increased mortality. We assigned new patients from other healthcare centers to one of the six clusters on the basis of maximum semantic similarity to the original patients. We show that the identified clusters were generalizable across different hospital systems and that the increased mortality rate was consistently observed in two of the clusters. Semantic phenotypic clustering can provide a foundation for assigning patients to stratified subgroups for natural history or therapy studies on PASC.

7.

NSAID use and clinical outcomes in COVID-19 patients: a 38-center retrospective cohort study.

Reese, Justin T; Coleman, Ben; Chan, Lauren; Blau, Hannah; Callahan, Tiffany J; Cappelletti, Luca; Fontana, Tommaso; Bradwell, Katie R; Harris, Nomi L; Casiraghi, Elena; Valentini, Giorgio; Karlebach, Guy; Deer, Rachel; McMurry, Julie A; Haendel, Melissa A; Chute, Christopher G; Pfaff, Emily; Moffitt, Richard; Spratt, Heidi; Singh, Jasvinder A; Mungall, Christopher J; Williams, Andrew E; Robinson, Peter N.

Virol J ; 19(1): 84, 2022 05 15.

Article in English | MEDLINE | ID: mdl-35570298

ABSTRACT

BACKGROUND: Non-steroidal anti-inflammatory drugs (NSAIDs) are commonly used to reduce pain, fever, and inflammation but have been associated with complications in community-acquired pneumonia. Observations shortly after the start of the COVID-19 pandemic in 2020 suggested that ibuprofen was associated with an increased risk of adverse events in COVID-19 patients, but subsequent observational studies failed to demonstrate increased risk and in one case showed reduced risk associated with NSAID use. METHODS: A 38-center retrospective cohort study was performed that leveraged the harmonized, high-granularity electronic health record data of the National COVID Cohort Collaborative. A propensity-matched cohort of 19,746 COVID-19 inpatients was constructed by matching cases (treated with NSAIDs at the time of admission) and 19,746 controls (not treated) from 857,061 patients with COVID-19 available for analysis. The primary outcome of interest was COVID-19 severity in hospitalized patients, which was classified as: moderate, severe, or mortality/hospice. Secondary outcomes were acute kidney injury (AKI), extracorporeal membrane oxygenation (ECMO), invasive ventilation, and all-cause mortality at any time following COVID-19 diagnosis. RESULTS: Logistic regression showed that NSAID use was not associated with increased COVID-19 severity (OR: 0.57; 95% CI: 0.53-0.61). Analysis of secondary outcomes using logistic regression showed that NSAID use was not associated with increased risk of all-cause mortality (OR: 0.51; 95% CI: 0.47-0.56), invasive ventilation (OR: 0.59; 95% CI: 0.55-0.64), AKI (OR: 0.67; 95% CI: 0.63-0.72), or ECMO (OR: 0.51; 95% CI: 0.36-0.7). In contrast, the odds ratios indicate reduced risk of these outcomes, but our quantitative bias analysis showed E-values between 1.9 and 3.3 for these associations, indicating that comparatively weak or moderate confounder associations could explain away the observed associations. CONCLUSIONS: Study interpretation is limited by the observational design. Recording of NSAID use may have been incomplete. Our study demonstrates that NSAID use is not associated with increased COVID-19 severity, all-cause mortality, invasive ventilation, AKI, or ECMO in COVID-19 inpatients. A conservative interpretation in light of the quantitative bias analysis is that there is no evidence that NSAID use is associated with risk of increased severity or the other measured outcomes. Our results confirm and extend analogous findings in previous observational studies using a large cohort of patients drawn from 38 centers in a nationally representative multicenter database.
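The E-values mentioned in RESULTS quantify how strong an unmeasured confounder would need to be to explain away an association. A minimal sketch of the standard E-value formula for a ratio measure follows. It assumes the odds ratio can be treated as an approximate risk ratio, which is an assumption made here for illustration; the abstract does not say which conversion, if any, the authors applied.

```python
import math


def e_value(ratio):
    """E-value for a ratio estimate.

    Protective estimates (ratio < 1) are inverted first, then
    E = RR + sqrt(RR * (RR - 1)).
    Treating an odds ratio as a risk ratio is an assumption of this sketch.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    rr = 1 / ratio if ratio < 1 else ratio
    return rr + math.sqrt(rr * (rr - 1))
```

Under that assumption, `e_value(0.51)` evaluates to about 3.33, in line with the upper end of the range reported in the abstract.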

Subject(s)

Acute Kidney Injury , COVID-19 , Anti-Inflammatory Agents, Non-Steroidal/adverse effects , COVID-19 Testing , Cohort Studies , Humans , Pandemics , Retrospective Studies

8.

Betacoronavirus-specific alternate splicing.

Karlebach, Guy; Aronow, Bruce; Baylin, Stephen B; Butler, Daniel; Foox, Jonathan; Levy, Shawn; Meydan, Cem; Mozsary, Christopher; Saravia-Butler, Amanda M; Taylor, Deanne M; Wurtele, Eve; Mason, Christopher E; Beheshti, Afshin; Robinson, Peter N.

Genomics ; 114(2): 110270, 2022 03.

Article in English | MEDLINE | ID: mdl-35074468

ABSTRACT

Viruses can subvert a number of cellular processes including splicing in order to block innate antiviral responses, and many viruses interact with cellular splicing machinery. SARS-CoV-2 infection was shown to suppress global mRNA splicing, and at least 10 SARS-CoV-2 proteins bind specifically to one or more human RNAs. Here, we investigate 17 published experimental and clinical datasets related to SARS-CoV-2 infection, datasets from the betacoronaviruses SARS-CoV and MERS, as well as Streptococcus pneumoniae, HCV, Zika virus, Dengue virus, influenza H3N2, and RSV. We show that genes showing differential alternative splicing in SARS-CoV-2 have a similar functional profile to those of SARS-CoV and MERS and affect a diverse set of genes and biological functions, including many closely related to virus biology. Additionally, the differentially spliced transcripts of cells infected by coronaviruses were more likely to undergo intron-retention, contain a pseudouridine modification, and have a smaller number of exons as compared with differentially spliced transcripts in the control groups. Viral load in clinical COVID-19 samples was correlated with isoform distribution of differentially spliced genes. A significantly higher number of ribosomal genes are affected by differential alternative splicing and gene expression in betacoronavirus samples, and the betacoronavirus differentially spliced genes are depleted for binding sites of RNA-binding proteins. Our results demonstrate characteristic patterns of differential splicing in cells infected by SARS-CoV-2, SARS-CoV, and MERS. The alternative splicing changes observed in betacoronavirus infection potentially modify a broad range of cellular functions, via changes in the functions of the products of a diverse set of genes involved in different biological processes.

Subject(s)

COVID-19 , Influenza, Human , Zika Virus Infection , Zika Virus , Alternative Splicing , COVID-19/genetics , Humans , Influenza A Virus, H3N2 Subtype , SARS-CoV-2/genetics , Zika Virus/genetics

9.

Betacoronavirus-specific alternate splicing.

Karlebach, Guy; Aronow, Bruce; Baylin, Stephen B; Butler, Daniel; Foox, Jonathan; Levy, Shawn; Meydan, Cem; Mozsary, Christopher; Saravia-Butler, Amanda M; Taylor, Deanne M; Wurtele, Eve; Mason, Christopher E; Beheshti, Afshin; Robinson, Peter N.

bioRxiv ; 2021 Jul 02.

Article in English | MEDLINE | ID: mdl-34230929

ABSTRACT

Viruses can subvert a number of cellular processes in order to block innate antiviral responses, and many viruses interact with cellular splicing machinery. SARS-CoV-2 infection was shown to suppress global mRNA splicing, and at least 10 SARS-CoV-2 proteins bind specifically to one or more human RNAs. Here, we investigate 17 published experimental and clinical datasets related to SARS-CoV-2 infection, datasets from the betacoronaviruses SARS-CoV and MERS, as well as Streptococcus pneumoniae, HCV, Zika virus, Dengue virus, influenza H3N2, and RSV. We show that genes showing differential alternative splicing in SARS-CoV-2 have a similar functional profile to those of SARS-CoV and MERS and affect a diverse set of genes and biological functions, including many closely related to virus biology. Additionally, the differentially spliced transcripts of cells infected by coronaviruses were more likely to undergo intron-retention, contain a pseudouridine modification, and have a smaller number of exons than differentially spliced transcripts in the control groups. Viral load in clinical COVID-19 samples was correlated with isoform distribution of differentially spliced genes. A significantly higher number of ribosomal genes are affected by differential alternative splicing (DAS) and differential gene expression (DGE) in betacoronavirus samples, and the betacoronavirus differentially spliced genes are depleted for binding sites of RNA-binding proteins. Our results demonstrate characteristic patterns of differential splicing in cells infected by SARS-CoV-2, SARS-CoV, and MERS, potentially modifying a broad range of cellular functions and affecting a diverse set of genes and biological functions.

10.

NSAID use and clinical outcomes in COVID-19 patients: A 38-center retrospective cohort study.

Reese, Justin T; Coleman, Ben; Chan, Lauren; Blau, Hannah; Callahan, Tiffany J; Cappelletti, Luca; Fontana, Tommaso; Bradwell, Katie Rebecca; Harris, Nomi L; Casiraghi, Elena; Valentini, Giorgio; Karlebach, Guy; Deer, Rachel; McMurry, Julie A; Haendel, Melissa A; Chute, Christopher G; Pfaff, Emily; Moffitt, Richard; Spratt, Heidi; Singh, Jasvinder; Mungall, Christopher J; Williams, Andrew E; Robinson, Peter N.

medRxiv ; 2021 Dec 22.

Article in English | MEDLINE | ID: mdl-33907758

ABSTRACT

BACKGROUND: Non-steroidal anti-inflammatory drugs (NSAIDs) are commonly used to reduce pain, fever, and inflammation but have been associated with complications in community-acquired pneumonia. Observations shortly after the start of the COVID-19 pandemic in 2020 suggested that ibuprofen was associated with an increased risk of adverse events in COVID-19 patients, but subsequent observational studies failed to demonstrate increased risk and in one case showed reduced risk associated with NSAID use. METHODS: A 38-center retrospective cohort study was performed that leveraged the harmonized, high-granularity electronic health record data of the National COVID Cohort Collaborative. A propensity-matched cohort of COVID-19 inpatients was constructed by matching cases (treated with NSAIDs) and controls (not treated) from 857,061 patients with COVID-19. The primary outcome of interest was COVID-19 severity in hospitalized patients, which was classified as: moderate, severe, or mortality/hospice. Secondary outcomes were acute kidney injury (AKI), extracorporeal membrane oxygenation (ECMO), invasive ventilation, and all-cause mortality at any time following COVID-19 diagnosis. RESULTS: Logistic regression showed that NSAID use was not associated with increased COVID-19 severity (OR: 0.57; 95% CI: 0.53-0.61). Analysis of secondary outcomes using logistic regression showed that NSAID use was not associated with increased risk of all-cause mortality (OR: 0.51; 95% CI: 0.47-0.56), invasive ventilation (OR: 0.59; 95% CI: 0.55-0.64), AKI (OR: 0.67; 95% CI: 0.63-0.72), or ECMO (OR: 0.51; 95% CI: 0.36-0.7). In contrast, the odds ratios indicate reduced risk of these outcomes, but our quantitative bias analysis showed E-values between 1.9 and 3.3 for these associations, indicating that comparatively weak or moderate confounder associations could explain away the observed associations. CONCLUSIONS: Study interpretation is limited by the observational design. Recording of NSAID use may have been incomplete. Our study demonstrates that NSAID use is not associated with increased COVID-19 severity, all-cause mortality, invasive ventilation, AKI, or ECMO in COVID-19 inpatients. A conservative interpretation in light of the quantitative bias analysis is that there is no evidence that NSAID use is associated with risk of increased severity or the other measured outcomes. Our findings are the largest EHR-based analysis of the effect of NSAIDs on outcome in COVID-19 patients to date. Our results confirm and extend analogous findings in previous observational studies using a large cohort of patients drawn from 38 centers in a nationally representative multicenter database.

11.

Interpretable Clinical Genomics with a Likelihood Ratio Paradigm.

Robinson, Peter N; Ravanmehr, Vida; Jacobsen, Julius O B; Danis, Daniel; Zhang, Xingmin Aaron; Carmody, Leigh C; Gargano, Michael A; Thaxton, Courtney L; Karlebach, Guy; Reese, Justin; Holtgrewe, Manuel; Köhler, Sebastian; McMurry, Julie A; Haendel, Melissa A; Smedley, Damian.

Am J Hum Genet ; 107(3): 403-417, 2020 09 03.

Article in English | MEDLINE | ID: mdl-32755546

ABSTRACT

Human Phenotype Ontology (HPO)-based analysis has become standard for genomic diagnostics of rare diseases. Current algorithms use a variety of semantic and statistical approaches to prioritize the typically long lists of genes with candidate pathogenic variants. These algorithms do not provide robust estimates of the strength of the predictions beyond the placement in a ranked list, nor do they provide measures of how much any individual phenotypic observation has contributed to the prioritization result. However, given that the overall success rate of genomic diagnostics is only around 25%-50% or less in many cohorts, a good ranking cannot be taken to imply that the gene or disease at rank one is necessarily a good candidate. Here, we present an approach to genomic diagnostics that exploits the likelihood ratio (LR) framework to provide an estimate of (1) the posttest probability of candidate diagnoses, (2) the LR for each observed HPO phenotype, and (3) the predicted pathogenicity of observed genotypes. LIkelihood Ratio Interpretation of Clinical AbnormaLities (LIRICAL) placed the correct diagnosis within the first three ranks in 92.9% of 384 case reports comprising 262 Mendelian diseases, and the correct diagnosis had a mean posttest probability of 67.3%. Simulations show that LIRICAL is robust to many typically encountered forms of genomic and phenomic noise. In summary, LIRICAL provides accurate, clinically interpretable results for phenotype-driven genomic diagnostics.
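The likelihood ratio framework named in this abstract turns a pretest probability into a posttest probability by way of odds. The sketch below shows that arithmetic only. It assumes that the individual likelihood ratios (for each phenotype and for the genotype) can be multiplied as if conditionally independent; the function name and the example numbers are hypothetical and are not taken from LIRICAL.

```python
def posttest_probability(pretest_probability, likelihood_ratios):
    """Convert a pretest probability and a list of likelihood ratios
    into a posttest probability.

    posttest odds = pretest odds * product of the likelihood ratios
    (assumes the individual observations are conditionally independent).
    """
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest probability must be strictly between 0 and 1")
    odds = pretest_probability / (1 - pretest_probability)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)
```

Because each observation contributes its own factor, the list of likelihood ratios also shows how much any single phenotype moved the result: ratios above 1 support the candidate diagnosis and ratios below 1 count against it.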

Subject(s)

Computational Biology , Databases, Genetic , Genomics , Rare Diseases/diagnosis , Algorithms , Exome/genetics , Humans , Phenotype , Rare Diseases/genetics , Software

12.

HBA-DEALS: accurate and simultaneous identification of differential expression and splicing using hierarchical Bayesian analysis.

Karlebach, Guy; Hansen, Peter; Veiga, Diogo Ft; Steinhaus, Robin; Danis, Daniel; Li, Sheng; Anczukow, Olga; Robinson, Peter N.

Genome Biol ; 21(1): 171, 2020 07 13.

Article in English | MEDLINE | ID: mdl-32660516

ABSTRACT

We present Hierarchical Bayesian Analysis of Differential Expression and ALternative Splicing (HBA-DEALS), which simultaneously characterizes differential expression and splicing in cohorts. HBA-DEALS attains state-of-the-art or better performance for both expression and splicing and allows genes to be characterized as having differential gene expression (DGE), differential alternative splicing (DAST), both, or neither. HBA-DEALS analysis of GTEx data demonstrated sets of genes that show predominant DGE or DAST across multiple tissue types. These sets have pervasive differences with respect to gene structure, function, membership in protein complexes, and promoter architecture.

Subject(s)

Alternative Splicing , Gene Expression , Models, Biological , Sequence Analysis, RNA , Software , Bayes Theorem

13.

Computational Processing and Quality Control of Hi-C, Capture Hi-C and Capture-C Data.

Hansen, Peter; Gargano, Michael; Hecht, Jochen; Ibn-Salem, Jonas; Karlebach, Guy; Roehr, Johannes T; Robinson, Peter N.

Genes (Basel) ; 10(7), 2019 07 18.

Article in English | MEDLINE | ID: mdl-31323892

ABSTRACT

Hi-C, capture Hi-C (CHC) and Capture-C have contributed greatly to our present understanding of the three-dimensional organization of genomes in the context of transcriptional regulation by characterizing the roles of topological associated domains, enhancer promoter loops and other three-dimensional genomic interactions. The analysis is based on counts of chimeric read pairs that map to interacting regions of the genome. However, the processing and quality control presents a number of unique challenges. We review here the experimental and computational foundations and explain how the characteristics of restriction digests, sonication fragments and read pairs can be exploited to distinguish technical artefacts from valid read pairs originating from true chromatin interactions.

Subject(s)

Chromatin/genetics , Computational Biology , Genome , Genomics , Chromosome Mapping , Computational Biology/methods , Databases, Genetic , Genomics/methods , High-Throughput Nucleotide Sequencing , Humans , Quality Control

14.

Left-Right Asymmetry of Maturation Rates in Human Embryonic Neural Development.

de Kovel, Carolien G F; Lisgo, Steven; Karlebach, Guy; Ju, Jia; Cheng, Gang; Fisher, Simon E; Francks, Clyde.

Biol Psychiatry ; 82(3): 204-212, 2017 08 01.

Article in English | MEDLINE | ID: mdl-28267988

ABSTRACT

BACKGROUND: Left-right asymmetry is a fundamental organizing feature of the human brain, and neuropsychiatric disorders such as schizophrenia sometimes involve alterations of brain asymmetry. As early as 8 weeks postconception, the majority of human fetuses move their right arms more than their left arms, but because nerve fiber tracts are still descending from the forebrain at this stage, spinal-muscular asymmetries are likely to play an important developmental role. METHODS: We used RNA sequencing to measure gene expression levels in the left and right spinal cords, and the left and right hindbrains, of 18 postmortem human embryos aged 4 to 8 weeks postconception. Genes showing embryonic lateralization were tested for an enrichment of signals in genome-wide association data for schizophrenia. RESULTS: The left side of the embryonic spinal cord was found to mature faster than the right side. Both sides transitioned from transcriptional profiles associated with cell division and proliferation at earlier stages to neuronal differentiation and function at later stages, but the two sides were not in synchrony (p = 2.2 E-161). The hindbrain showed a left-right mirrored pattern compared with the spinal cord, consistent with the well-known crossing over of function between these two structures. Genes that showed lateralization in the embryonic spinal cord were enriched for association signals with schizophrenia (p = 4.3 E-05). CONCLUSIONS: These are the earliest stage left-right differences of human neural development ever reported. Disruption of the lateralized developmental program may play a role in the genetic susceptibility to schizophrenia.

Subject(s)

Rhombencephalon/embryology , Rhombencephalon/metabolism , Spinal Cord/embryology , Spinal Cord/metabolism , Functional Laterality , Gene Expression Profiling , Gene Expression Regulation, Developmental , Genome-Wide Association Study , Humans , Rhombencephalon/pathology , Schizophrenia/genetics , Schizophrenia/metabolism , Spinal Cord/pathology

15.

Lateralization of gene expression in human language cortex.

Karlebach, Guy; Francks, Clyde.

Cortex ; 67: 30-6, 2015 Jun.

Article in English | MEDLINE | ID: mdl-25863470

ABSTRACT

Lateralization is an important aspect of the functional brain architecture for language and other cognitive faculties. The molecular genetic basis of human brain lateralization is unknown, and recent studies have suggested that gene expression in the cerebral cortex is bilaterally symmetrical. Here we have re-analyzed two transcriptomic datasets derived from post mortem human cerebral cortex, with a specific focus on superior temporal and auditory language cortex in adults. We applied an empirical Bayes approach to model differential left-right expression, together with gene ontology (GO) analysis and meta-analysis. There was robust and reproducible lateralization of individual genes and GO groups that are likely to fine-tune the electrophysiological and neurotransmission properties of cortical circuits, most notably synaptic transmission, nervous system development and glutamate receptor activity. Our findings anchor the cerebral biology of language to the molecular genetic level. Future research in model systems may determine how these molecular signatures of neurophysiological lateralization effect fine-tuning of cerebral cortical function, differently in the two hemispheres.

Subject(s)

Auditory Cortex/metabolism , Functional Laterality/genetics , Language , RNA, Messenger/metabolism , Temporal Lobe/metabolism , Transcriptome , Adolescent , Adult , Bayes Theorem , Cerebral Cortex/metabolism , Gene Expression , Humans , Male , Middle Aged , Young Adult

16.

Inferring Boolean network states from partial information.

Karlebach, Guy.

EURASIP J Bioinform Syst Biol ; 2013(1): 11, 2013 Sep 05.

Article in English | MEDLINE | ID: mdl-24006954

ABSTRACT

Networks of molecular interactions regulate key processes in living cells. Therefore, understanding their functionality is a high priority in advancing biological knowledge. Boolean networks are often used to describe cellular networks mathematically and are fitted to experimental datasets. The fitting often results in ambiguities since the interpretation of the measurements is not straightforward and since the data contain noise. In order to facilitate a more reliable mapping between datasets and Boolean networks, we develop an algorithm that infers network trajectories from a dataset distorted by noise. We analyze our algorithm theoretically and demonstrate its accuracy using simulation and microarray expression data.
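
As a rough illustration of what inferring a network trajectory from noisy data can mean, the sketch below simulates a small synchronous Boolean network and picks the simulated trajectory that is closest, in Hamming distance, to a noisy observation. This is a brute-force toy, not the algorithm developed in the paper: the three-node rules, the function names and the distance criterion are all assumptions made for the example.

```python
from itertools import product

# Hypothetical 3-node network (A, B, C); each rule gives a node's next value.
RULES = (
    lambda s: s[2],               # A copies C
    lambda s: s[0] and not s[2],  # B = A AND NOT C
    lambda s: s[0] or s[1],       # C = A OR B
)

def step(state):
    """Apply all rules synchronously to obtain the next state."""
    return tuple(int(bool(rule(state))) for rule in RULES)

def trajectory(initial, length):
    """Simulate `length` consecutive states starting from `initial`."""
    states = [initial]
    for _ in range(length - 1):
        states.append(step(states[-1]))
    return states

def hamming(a, b):
    """Count differing bits between two trajectories of equal length."""
    return sum(x != y for sa, sb in zip(a, b) for x, y in zip(sa, sb))

def best_trajectory(observed):
    """Return the noise-free trajectory closest to the observed one."""
    candidates = (
        trajectory(init, len(observed))
        for init in product((0, 1), repeat=len(RULES))
    )
    return min(candidates, key=lambda t: hamming(t, observed))

# A true trajectory with one bit flipped in the second time point.
TRUE = trajectory((1, 0, 0), 4)
NOISY = [TRUE[0], (0, 0, 1), TRUE[2], TRUE[3]]
RECOVERED = best_trajectory(NOISY)
```

Enumerating every initial state is only feasible for very small networks, since the number of candidates doubles with each added node.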

17.

Constructing logical models of gene regulatory networks by integrating transcription factor-DNA interactions with expression data: an entropy-based approach.

Karlebach, Guy; Shamir, Ron.

J Comput Biol ; 19(1): 30-41, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22216865

ABSTRACT

Models of gene regulatory networks (GRNs) attempt to explain the complex processes that determine cells' behavior, such as differentiation, metabolism, and the cell cycle. The advent of high-throughput data generation technologies has allowed researchers to fit theoretical models to experimental data on gene-expression profiles. GRNs are often represented using logical models. These models require that real-valued measurements be converted to discrete levels, such as on/off, but the discretization often introduces inconsistencies into the data. Dimitrova et al. posed the problem of efficiently finding a parsimonious resolution of the introduced inconsistencies. We show that reconstruction of a logical GRN that minimizes the errors is NP-complete, so that an efficient exact algorithm for the problem is not likely to exist. We present a probabilistic formulation of the problem that circumvents discretization of expression data. We phrase the problem of error reduction as a minimum entropy problem, develop a heuristic algorithm for it, and evaluate its performance on mouse embryonic stem cell data. The constructed model displays high consistency with prior biological knowledge. Despite the oversimplification of a discrete model, we show that it is superior to raw experimental measurements and demonstrates a highly significant level of identical regulatory logic among co-regulated genes. Software implementing the method is freely available at: http://acgt.cs.tau.ac.il/modent.

Subject(s)

Computational Biology/methods , DNA/genetics , Gene Regulatory Networks/genetics , Models, Genetic , Oligonucleotide Array Sequence Analysis/methods , Transcription Factors/genetics , Algorithms , Animals , Computer Simulation , DNA/metabolism , Entropy , Gene Expression Profiling/methods , Mice , Models, Statistical , Software , Transcription Factors/metabolism
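
The inconsistency problem mentioned in the abstract of entry 17 can be made concrete with a small sketch. It thresholds real-valued expression levels to on/off and reports regulator states that map to conflicting target values. The 0.5 threshold, the sample values and the function names are illustrative assumptions; the sketch does not reproduce the paper's entropy-based method, which avoids discretization altogether.

```python
from collections import defaultdict

def discretize(values, threshold=0.5):
    """Convert real-valued levels to on/off (1/0) with a fixed threshold."""
    return tuple(int(v >= threshold) for v in values)

def find_inconsistencies(samples, threshold=0.5):
    """Return discrete regulator states observed with both target values.

    `samples` is a list of (regulator_levels, target_level) pairs.
    """
    outputs = defaultdict(set)
    for regulators, target in samples:
        state = discretize(regulators, threshold)
        outputs[state].add(int(target >= threshold))
    return {state: outs for state, outs in outputs.items() if len(outs) > 1}

# Hypothetical measurements: two regulators and one target gene.
SAMPLES = [
    ((0.9, 0.1), 0.8),
    ((0.8, 0.2), 0.3),  # same discrete input (1, 0) as above, opposite output
    ((0.1, 0.7), 0.2),
    ((0.2, 0.9), 0.1),
]

CONFLICTS = find_inconsistencies(SAMPLES)
```

Here the first two samples collapse to the same discrete input but disagree on the target, so no deterministic logical rule can fit both without treating one of them as an error.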

18.

Minimally perturbing a gene regulatory network to avoid a disease phenotype: the glioma network as a test case.

Karlebach, Guy; Shamir, Ron.

BMC Syst Biol ; 4: 15, 2010 Feb 25.

Article in English | MEDLINE | ID: mdl-20184733

ABSTRACT

BACKGROUND: Mathematical modeling of biological networks is an essential part of Systems Biology. Developing and using such models in order to understand gene regulatory networks is a major challenge. RESULTS: We present an algorithm that determines the smallest perturbations required for manipulating the dynamics of a network formulated as a Petri net, in order to cause or avoid a specified phenotype. By modifying McMillan's unfolding algorithm, we handle partial knowledge and reduce computation cost. The methodology is demonstrated on a glioma network. Out of the single gene perturbations, activation of glutathione S-transferase P (GSTP1) gene was by far the most effective in blocking the cancer phenotype. Among pairs of perturbations, NFkB and TGF-beta had the largest joint effect, in accordance with their role in the EMT process. CONCLUSION: Our method allows perturbation analysis of regulatory networks and can overcome incomplete information. It can help in identifying drug targets and in prioritizing perturbation experiments.

Subject(s)

Brain Neoplasms/genetics , Gene Expression Regulation, Neoplastic/genetics , Glioma/genetics , Models, Genetic , Neoplasm Proteins/genetics , Phenotype , Signal Transduction/genetics , Algorithms , Animals , Computer Simulation , Genetic Engineering/methods , Genetic Predisposition to Disease/genetics , Humans

19.

Modelling and analysis of gene regulatory networks.

Karlebach, Guy; Shamir, Ron.

Nat Rev Mol Cell Biol ; 9(10): 770-80, 2008 Oct.

Article in English | MEDLINE | ID: mdl-18797474

ABSTRACT

Gene regulatory networks have an important role in every process of life, including cell differentiation, metabolism, the cell cycle and signal transduction. By understanding the dynamics of these networks we can shed light on the mechanisms of diseases that occur when these cellular processes are dysregulated. Accurate prediction of the behaviour of regulatory networks will also speed up biotechnological projects, as such predictions are quicker and cheaper than lab experiments. Computational methods, both for supporting the development of network models and for the analysis of their functionality, have already proved to be a valuable research tool.

Subject(s)

Gene Regulatory Networks , Models, Genetic , Algorithms , Animals , Bacteriophage lambda/genetics , Bacteriophage lambda/physiology , Humans , Linear Models , Mathematics , Models, Biological , Models, Statistical , Stochastic Processes , Transcription Factors/genetics , Transcription Factors/metabolism
