ABSTRACT
How disease-associated mutations impair protein activities in the context of biological networks remains mostly undetermined. Although a few renowned alleles are well characterized, functional information is missing for over 100,000 disease-associated variants. Here we functionally profile several thousand missense mutations across a spectrum of Mendelian disorders using various interaction assays. The majority of disease-associated alleles exhibit wild-type chaperone binding profiles, suggesting they preserve protein folding or stability. While common variants from healthy individuals rarely affect interactions, two-thirds of disease-associated alleles perturb protein-protein interactions, with half corresponding to "edgetic" alleles affecting only a subset of interactions while leaving most other interactions unperturbed. With transcription factors, many alleles that leave protein-protein interactions intact affect DNA binding. Different mutations in the same gene leading to different interaction profiles often result in distinct disease phenotypes. Thus disease-associated alleles that perturb distinct protein activities rather than grossly affecting folding and stability are relatively widespread.
Subject(s)
Disease/genetics, Missense Mutation, Protein Interaction Maps, Proteins/genetics, Proteins/metabolism, DNA-Binding Proteins/genetics, DNA-Binding Proteins/metabolism, Genome-Wide Association Study, Humans, Open Reading Frames, Protein Folding, Protein Stability
ABSTRACT
Complex biological systems and cellular networks may underlie most genotype to phenotype relationships. Here, we review basic concepts in network biology, discussing different types of interactome networks and the insights that can come from analyzing them. We elaborate on why interactome networks are important to consider in biology, how they can be mapped and integrated with each other, what global properties are starting to emerge from interactome network models, and how these properties may relate to human disease.
Subject(s)
Disease/genetics, Metabolic Networks and Pathways, Proteins/metabolism, Gene Regulatory Networks, Humans, Protein Interaction Mapping, Systems Biology
ABSTRACT
Network medicine has improved the mechanistic understanding of disease, offering quantitative insights into disease mechanisms, comorbidities, and novel diagnostic tools and therapeutic treatments. Yet, most network-based approaches rely on a comprehensive map of protein-protein interactions (PPI), ignoring interactions mediated by noncoding RNAs (ncRNAs). Here, we systematically combine experimentally confirmed binding interactions mediated by ncRNA with PPI, constructing a comprehensive network of all physical interactions in the human cell. We find that the inclusion of ncRNA expands the number of genes in the interactome by 46% and the number of interactions by 107%, significantly enhancing our ability to identify disease modules. Indeed, we find that 132 diseases lacked a statistically significant disease module in the protein-based interactome but have a statistically significant disease module after inclusion of ncRNA-mediated interactions, making these diseases accessible to the tools of network medicine. We show that the inclusion of ncRNAs helps unveil disease-disease relationships that were not detectable before and expands our ability to predict comorbidity patterns between diseases. Taken together, we find that including noncoding interactions improves both the breadth and the predictive accuracy of network medicine.
Subject(s)
MicroRNAs, Long Noncoding RNA, Humans, Untranslated RNA/genetics, Untranslated RNA/metabolism, Comorbidity, Long Noncoding RNA/genetics, MicroRNAs/genetics
ABSTRACT
The impact of a scientific publication is often measured by the number of citations it receives from the scientific community. However, citation count is susceptible to well-documented variations in citation practices across time and discipline, limiting our ability to compare different scientific achievements. Previous efforts to account for citation variations often rely on a priori discipline labels of papers, assuming that all papers in a discipline are identical in their subject matter. Here, we propose a network-based methodology to quantify the impact of an article by comparing it with locally comparable research, thereby eliminating the discipline label requirement. We show that the developed measure is not susceptible to discipline bias and follows a universal distribution for all articles published in different years, offering an unbiased indicator for impact across time and discipline. We then use the indicator to identify science-wide high impact research in the past half century and quantify its temporal production dynamics across disciplines, helping us identify breakthroughs from diverse, smaller disciplines, such as geosciences, radiology, and optics, as opposed to citation-rich biomedical sciences. Our work provides insights into the evolution of science and paves a way for fair comparisons of the impact of diverse contributions across many fields.
Subject(s)
Bibliometrics, Journal Impact Factor, Bias, Achievement
ABSTRACT
MOTIVATION: A major hindrance towards using Machine Learning (ML) on medical datasets is the discrepancy between a large number of variables and small sample sizes. While multiple feature selection techniques have been proposed to avoid the resulting overfitting, overall, ensemble techniques offer the best selection robustness. Yet, current methods designed to combine different algorithms generally fail to leverage the dependencies identified by their components. Here, we propose Graphical Ensembling (GE), a graph-theory-based ensemble feature selection technique designed to improve the stability and relevance of the selected features. RESULTS: Relying on four datasets, we show that GE increases classification performance with fewer selected features. For example, on rheumatoid arthritis patient stratification, GE outperforms the baseline methods by 9% Balanced Accuracy while relying on fewer features. We use data on sub-cellular networks to show that the selected features (proteins) are closer to the known disease genes, and the uncovered biological mechanisms are more diversified. By successfully tackling the complex correlations between biological variables, we anticipate that GE will improve the medical applications of ML. AVAILABILITY AND IMPLEMENTATION: https://github.com/ebattistella/auto_machine_learning.
Subject(s)
Algorithms, Machine Learning, Humans, Rheumatoid Arthritis, Computational Biology/methods
ABSTRACT
Diet, a modifiable risk factor, plays a pivotal role in most diseases, from cardiovascular disease to type 2 diabetes mellitus, cancer, and obesity. However, our understanding of the mechanistic role of the chemical compounds found in food remains incomplete. In this review, we explore the "dark matter" of nutrition, going beyond the macro- and micronutrients documented by national databases to unveil the exceptional chemical diversity of food composition. We also discuss the need to explore the impact of each compound in the presence of associated chemicals and relevant food sources and describe the tools that will allow us to do so. Finally, we discuss the role of network medicine in understanding the mechanism of action of each food molecule. Overall, we illustrate the important role of network science and artificial intelligence in our ability to reveal nutrition's multifaceted role in health and disease.
Subject(s)
Diet, Humans, Food, Artificial Intelligence
ABSTRACT
In this Letter, in Fig. 3c and f the Saccharomyces cerevisiae and Escherichia coli networks were subject to both weight loss and node deletion, a combination of two types of perturbation, as opposed to weight loss only (as the labelling incorrectly indicated). The collapse in Fig. 3h was also obtained from this combined perturbation, and therefore the results displayed in Fig. 3h remain fully consistent with the theoretical framework presented in this Letter. Figure 1 to this Amendment shows the corrected Fig. 3c, f and h, in which Fig. 3c and f have been generated with weight-loss perturbations only, as originally reported, together with the originally published panels, for completeness and transparency. The codes used to generate the original and the corrected Fig. 3 are available at https://github.com/jianxigao/NuRsE. We thank Travis A. Gibson for alerting us to this error. The original Letter has not been corrected.
ABSTRACT
The brain is a complex system comprising a myriad of interacting neurons, posing significant challenges in understanding its structure, function, and dynamics. Network science has emerged as a powerful tool for studying such interconnected systems, offering a framework for integrating multiscale data and complexity. To date, network methods have significantly advanced functional imaging studies of the human brain and have facilitated the development of control theory-based applications for directing brain activity. Here, we discuss emerging frontiers for network neuroscience in the brain atlas era, addressing the challenges and opportunities in integrating multiple data streams for understanding the neural transitions from development to healthy function to disease. We underscore the importance of fostering interdisciplinary opportunities through workshops, conferences, and funding initiatives, such as supporting students and postdoctoral fellows with interests in both disciplines. By bringing together the network science and neuroscience communities, we can develop novel network-based methods tailored to neural circuits, paving the way toward a deeper understanding of the brain and its functions, as well as offering new challenges for network science.
Subject(s)
Neurosciences, Humans, Brain, Drive (Psychology), Neurons, Research Personnel
ABSTRACT
The links of a physical network cannot cross, which often forces the network layout into nonoptimal entangled states. Here we define a network fabric as a two-dimensional projection of a network and propose the average crossing number as a measure of network entanglement. We analytically derive the dependence of the average crossing number on network density, average link length, degree heterogeneity, and community structure and show that the predictions accurately estimate the entanglement of both network models and of real physical networks.
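The entanglement measure described above can be illustrated with a minimal sketch. It assumes straight-line links between nodes with 3D coordinates, projects the layout onto randomly oriented planes, counts pairwise link crossings in each projection, and averages the counts. The function names, the random-projection sampling, and the rule that links sharing a node are not counted are illustrative assumptions, not the authors' implementation.

```python
import itertools
import numpy as np


def _segments_cross(p1, p2, q1, q2):
    """Return True if the 2D segments p1-p2 and q1-q2 properly intersect."""
    def orient(a, b, c):
        # Sign of the cross product (b - a) x (c - a).
        return np.sign((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))

    return (orient(p1, p2, q1) * orient(p1, p2, q2) < 0
            and orient(q1, q2, p1) * orient(q1, q2, p2) < 0)


def crossing_number(pos2d, links):
    """Count crossings between straight links in one 2D layout.

    Assumption: links that share a node are skipped.
    """
    count = 0
    for (a, b), (c, d) in itertools.combinations(links, 2):
        if len({a, b, c, d}) < 4:
            continue
        if _segments_cross(pos2d[a], pos2d[b], pos2d[c], pos2d[d]):
            count += 1
    return count


def average_crossing_number(pos3d, links, n_projections=200, seed=0):
    """Average the crossing count over randomly oriented 2D projections."""
    rng = np.random.default_rng(seed)
    pos3d = np.asarray(pos3d, dtype=float)
    total = 0
    for _ in range(n_projections):
        # Orthonormal basis from the QR factorization of a Gaussian matrix;
        # its first two columns span a randomly oriented projection plane.
        q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
        total += crossing_number(pos3d @ q[:, :2], links)
    return total / n_projections
```

The pairwise loop is quadratic in the number of links, which is adequate for small illustrative networks but not for large ones.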
ABSTRACT
In many physical networks, including neurons in the brain [1,2], three-dimensional integrated circuits [3] and underground hyphal networks [4], the nodes and links are physical objects that cannot intersect or overlap with each other. To take this into account, non-crossing conditions can be imposed to constrain the geometry of networks, which consequently affects how they form, evolve and function. However, these constraints are not included in the theoretical frameworks that are currently used to characterize real networks [5-7]. Most tools for laying out networks are variants of the force-directed layout algorithm [8,9], which assumes dimensionless nodes and links, and are therefore unable to reveal the geometry of densely packed physical networks. Here we develop a modelling framework that accounts for the physical sizes of nodes and links, allowing us to explore how non-crossing conditions affect the geometry of a network. For small link thicknesses, we observe a weakly interacting regime in which link crossings are avoided via local link rearrangements, without altering the overall geometry of the layout compared to the force-directed layout. Once the link thickness exceeds a threshold, a strongly interacting regime emerges in which multiple geometric quantities, such as the total link length and the link curvature, scale with the link thickness. We show that the crossover between the two regimes is driven by the non-crossing condition, which allows us to derive the transition point analytically and show that networks with large numbers of nodes will ultimately exist in the strongly interacting regime. We also find that networks in the weakly interacting regime display a solid-like response to stress, whereas in the strongly interacting regime they behave in a gel-like fashion. Networks in the weakly interacting regime are amenable to 3D printing and so can be used to visualize network geometry, and the strongly interacting regime provides insights into the scaling of the sizes of densely packed mammalian brains.
Subject(s)
Structural Models, Nerve Net/anatomy & histology, Mechanical Stress, Algorithms, Animals, Axons/physiology, Brain/anatomy & histology, Brain/cytology, Brain/physiology, Friction, Gels/chemistry, Mammals/anatomy & histology, Biological Models, Nerve Net/cytology, Nerve Net/physiology, Three-Dimensional Printing
ABSTRACT
The COVID-19 pandemic has highlighted the need to quickly and reliably prioritize clinically approved compounds for their potential effectiveness for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections. Here, we deployed algorithms relying on artificial intelligence, network diffusion, and network proximity, tasking each of them to rank 6,340 drugs for their expected efficacy against SARS-CoV-2. To test the predictions, we used as ground truth 918 drugs experimentally screened in VeroE6 cells, as well as the list of drugs in clinical trials that capture the medical community's assessment of drugs with potential COVID-19 efficacy. We find that no single predictive algorithm offers consistently reliable outcomes across all datasets and metrics. This outcome prompted us to develop a multimodal technology that fuses the predictions of all algorithms, finding that a consensus among the different predictive methods consistently exceeds the performance of the best individual pipelines. We screened in human cells the top-ranked drugs, obtaining a 62% success rate, in contrast to the 0.8% hit rate of nonguided screenings. Of the six drugs that reduced viral infection, four could be directly repurposed to treat COVID-19, proposing novel treatments for COVID-19. We also found that 76 of the 77 drugs that successfully reduced viral infection do not bind the proteins targeted by SARS-CoV-2, indicating that these network drugs rely on network-based mechanisms that cannot be identified using docking-based strategies. These advances offer a methodological pathway to identify repurposable drugs for future pathogens and neglected diseases underserved by the costs and extended timeline of de novo drug development.
Subject(s)
COVID-19 Drug Treatment, Drug Repositioning/methods, Systems Biology/methods, Animals, Antiviral Agents/administration & dosage, Antiviral Agents/pharmacology, Antiviral Agents/therapeutic use, Chlorocebus aethiops, Pharmaceutical Databases, Humans, Computer Neural Networks, Protein Binding, Vero Cells, Viral Proteins/metabolism
ABSTRACT
Recent studies on the controllability of complex systems offer a powerful mathematical framework to systematically explore the structure-function relationship in biological, social, and technological networks. Despite theoretical advances, we lack direct experimental proof of the validity of these widely used control principles. Here we fill this gap by applying a control framework to the connectome of the nematode Caenorhabditis elegans, allowing us to predict the involvement of each C. elegans neuron in locomotor behaviours. We predict that control of the muscles or motor neurons requires 12 neuronal classes, which include neuronal groups previously implicated in locomotion by laser ablation, as well as one previously uncharacterized neuron, PDB. We validate this prediction experimentally, finding that the ablation of PDB leads to a significant loss of dorsoventral polarity in large body bends. Importantly, control principles also allow us to investigate the involvement of individual neurons within each neuronal class. For example, we predict that, within the class of DD motor neurons, only three (DD04, DD05, or DD06) should affect locomotion when ablated individually. This prediction is also confirmed; single cell ablations of DD04 or DD05 specifically affect posterior body movements, whereas ablations of DD02 or DD03 do not. Our predictions are robust to deletions of weak connections, missing connections, and rewired connections in the current connectome, indicating the potential applicability of this analytical framework to larger and less well-characterized connectomes.
Subject(s)
Caenorhabditis elegans/cytology, Caenorhabditis elegans/physiology, Connectome, Nerve Net/cytology, Nerve Net/physiology, Neurons/physiology, Animals, Lasers, Locomotion/physiology, Motor Neurons/cytology, Motor Neurons/physiology, Neurons/classification
ABSTRACT
Despite rapid advances in connectome mapping and neuronal genetics, we lack theoretical and computational tools to unveil, in an experimentally testable fashion, the genetic mechanisms that govern neuronal wiring. Here we introduce a computational framework to link the adjacency matrix of a connectome to the expression patterns of its neurons, helping us uncover a set of genetic rules that govern the interactions between neurons in contact. The method incorporates the biological realities of the system, accounting for noise from data collection limitations, as well as spatial restrictions. The resulting methodology allows us to infer a network of 19 innexin interactions that govern the formation of gap junctions in Caenorhabditis elegans, five of which are already supported by experimental data. As advances in single-cell gene expression profiling increase the accuracy and the coverage of the data, the developed framework will allow researchers to systematically infer experimentally testable connection rules, offering mechanistic predictions for synapse and gap junction formation.
Subject(s)
Caenorhabditis elegans/genetics, Nervous System/metabolism, Animals, Connectome, Gap Junctions/metabolism, Neurological Models, Neurons/metabolism
ABSTRACT
There is extensive, yet fragmented, evidence of gender differences in academia suggesting that women are underrepresented in most scientific disciplines and publish fewer articles throughout a career, and their work acquires fewer citations. Here, we offer a comprehensive picture of longitudinal gender differences in performance through a bibliometric analysis of academic publishing careers by reconstructing the complete publication history of over 1.5 million gender-identified authors whose publishing career ended between 1955 and 2010, covering 83 countries and 13 disciplines. We find that, paradoxically, the increase of participation of women in science over the past 60 years was accompanied by an increase of gender differences in both productivity and impact. Most surprisingly, though, we uncover two gender invariants, finding that men and women publish at a comparable annual rate and have equivalent career-wise impact for the same size body of work. Finally, we demonstrate that differences in publishing career lengths and dropout rates explain a large portion of the reported career-wise differences in productivity and impact, although productivity differences still remain. This comprehensive picture of gender inequality in academia can help rephrase the conversation around the sustainability of women's careers in academia, with important consequences for institutions and policy makers.
Subject(s)
Authorship, Career Mobility, Periodicals as Topic/statistics & numerical data, Science/statistics & numerical data, Sexism/statistics & numerical data, Workforce/statistics & numerical data, Academic Success, Adult, Faculty/statistics & numerical data, Female, Humans, Male, Mathematics/statistics & numerical data, Middle Aged, Research Personnel/statistics & numerical data
ABSTRACT
Resilience, a system's ability to adjust its activity to retain its basic functionality when errors, failures and environmental changes occur, is a defining property of many complex systems. Despite widespread consequences for human health, the economy and the environment, events leading to loss of resilience (from cascading failures in technological systems to mass extinctions in ecological networks) are rarely predictable and are often irreversible. These limitations are rooted in a theoretical gap: the current analytical framework of resilience is designed to treat low-dimensional models with a few interacting components, and is unsuitable for multi-dimensional systems consisting of a large number of components that interact through a complex network. Here we bridge this theoretical gap by developing a set of analytical tools with which to identify the natural control and state parameters of a multi-dimensional complex system, helping us derive effective one-dimensional dynamics that accurately predict the system's resilience. The proposed analytical framework allows us systematically to separate the roles of the system's dynamics and topology, collapsing the behaviour of different networks onto a single universal resilience function. The analytical results unveil the network characteristics that can enhance or diminish resilience, offering ways to prevent the collapse of ecological, biological or economic systems, and guiding the design of technological systems resilient to both internal failures and environmental changes.
Subject(s)
Ecosystem, Gene Regulatory Networks/genetics, Biological Models, Physiological Adaptation, Gene Expression Regulation
Subject(s)
Medical Genetics/statistics & numerical data, Human Genome/genetics, Human Genome Project, Intergenic DNA/genetics, Drug Discovery, Genes/genetics, Inborn Genetic Diseases/genetics, Genetic Predisposition to Disease, Medical Genetics/trends, 21st Century History, Human Genome Project/history, Humans, Molecular Targeted Therapy, Single Nucleotide Polymorphism/genetics, Proteins/genetics
ABSTRACT
High-throughput technologies, offering an unprecedented wealth of quantitative data underlying the makeup of living systems, are changing biology. Notably, the systematic mapping of the relationships between biochemical entities has fueled the rapid development of network biology, offering a suitable framework to describe disease phenotypes and predict potential drug targets. However, our ability to develop accurate dynamical models remains limited, due in part to the limited knowledge of the kinetic parameters underlying these interactions. Here, we explore the degree to which we can make reasonably accurate predictions in the absence of the kinetic parameters. We find that simple dynamically agnostic models are sufficient to recover the strength and sign of the biochemical perturbation patterns observed in 87 biological models for which the underlying kinetics are known. Surprisingly, a simple distance-based model achieves 65% accuracy. We show that this predictive power is robust to topological and kinetic parameter perturbations, and we identify key network properties that can raise the recovery rate of the true perturbation patterns up to 80%. We validate our approach using experimental data on the chemotactic pathway in bacteria, finding that a network model of perturbation spreading predicts with ~80% accuracy the directionality of gene expression and phenotype changes in knock-out and overproduction experiments. These findings show that the steady advances in mapping out the topology of biochemical interaction networks open avenues for accurate perturbation spread modeling, with direct implications for medicine and drug development.
Subject(s)
Bacteria/metabolism, Chemotaxis/physiology, Bacterial Gene Expression Regulation/physiology, Biological Models, Bacteria/genetics
ABSTRACT
Experience plays a critical role in crafting high-impact scientific work. This is particularly evident in top multidisciplinary journals, where a scientist is unlikely to appear as senior author if he or she has not previously published within the same journal. Here, we develop a quantitative understanding of author order by quantifying this "chaperone effect," capturing how scientists transition into senior status within a particular publication venue. We illustrate that the chaperone effect has a different magnitude for journals in different branches of science, being more pronounced in medical and biological sciences and weaker in natural sciences. Finally, we show that in the case of high-impact venues, the chaperone effect has significant implications, specifically resulting in a higher average impact relative to papers authored by new principal investigators (PIs). Our findings shed light on the role played by experience in publishing within specific scientific journals, on the paths toward acquiring the necessary experience and expertise, and on the skills required to publish in prestigious venues.
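As a rough illustration of how the transition into senior status within a venue might be tallied, the sketch below classifies the senior (last) author of each paper by their prior history in the same journal. The three category names, the input format, and the use of simple counts are assumptions made for illustration; the abstract does not specify the actual measure.

```python
from collections import defaultdict


def senior_author_transitions(papers):
    """Classify the senior (last) author of each paper within its journal.

    papers: iterable of (year, journal, authors), authors in byline order.
    Hypothetical categories:
      "new"         - senior author has never appeared in this journal before
      "chaperoned"  - appeared before, but never as senior author
      "established" - has already been senior author in this journal
    Papers from the same year are processed in input order (a simplification).
    """
    seen = defaultdict(set)         # journal -> all earlier authors
    seen_senior = defaultdict(set)  # journal -> earlier senior authors
    counts = defaultdict(lambda: {"new": 0, "chaperoned": 0, "established": 0})

    for _year, journal, authors in sorted(papers, key=lambda p: p[0]):
        senior = authors[-1]
        if senior in seen_senior[journal]:
            counts[journal]["established"] += 1
        elif senior in seen[journal]:
            counts[journal]["chaperoned"] += 1
        else:
            counts[journal]["new"] += 1
        # Update the journal's history only after classifying this paper.
        seen[journal].update(authors)
        seen_senior[journal].add(senior)

    return {journal: dict(c) for journal, c in counts.items()}
```

Comparing these counts across journals gives only a descriptive picture; a quantitative claim about the effect would also need a baseline for how often each category is expected by chance.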
ABSTRACT
Governments in modern societies undertake an array of complex functions that shape politics and economics, individual and group behavior, and the natural, social, and built environment. How are governments structured to execute these diverse responsibilities? How do those structures vary, and what explains the differences? To examine these longstanding questions, we develop a technique for mapping the Internet "footprint" of government with network science methods. We use this approach to describe and analyze the diversity in functional scale and structure among the 50 US state governments reflected in the webpages and links they have created online: 32.5 million webpages and 110 million hyperlinks among 47,631 agencies. We first verify that this extensive online footprint systematically reflects known characteristics: 50 hierarchically organized networks of state agencies that scale with population and are specialized around easily identifiable functions in accordance with legal mandates. We also find that the footprint reflects extensive diversity among these state functional hierarchies. We hypothesize that this variation should reflect, among other factors, state income, economic structure, ideology, and location. We find that government structures are most strongly associated with state economic structures, with location and income playing more limited roles. Voters' recent ideological preferences about the proper roles and extent of government are not significantly associated with the scale and structure of their state governments as reflected online. We conclude that the online footprint of governments offers a broad and comprehensive window on how they are structured that can help deepen understanding of those structures.