ABSTRACT
Major depression (MD) and obesity are complex genetic disorders that are frequently comorbid. However, the two diseases have rarely been studied together, and the genetic mechanisms underlying this comorbidity therefore remain largely unknown. Here we examine the contribution of common and rare variants to this comorbidity through a next-generation sequencing (NGS) approach. Specific genomic regions of interest in MD and obesity were sequenced in a group of 654 individuals from the PISMA-ep epidemiological study. We obtained variants across the entire frequency spectrum and assessed their association with comorbid MD and obesity, at both the variant and gene levels. We identified 55 independent common variants and a burden of rare variants in 4 genes (PARK2, FGF21, HIST1H3D and RSRC1) associated with the comorbid phenotype. Follow-up analyses revealed significantly enriched gene sets associated with biological processes and pathways involved in metabolic dysregulation, hormone signaling and cell cycle regulation. Our results suggest that, while risk variants specific to the comorbid phenotype have been identified, the genes functionally impacted by the risk variants share cell biological processes and signaling pathways with the MD and obesity phenotypes considered separately. To the best of our knowledge, this is the first study to apply a targeted sequencing approach to comorbid MD and obesity. The framework presented here allowed a deep characterization of the genetics of co-occurring MD and obesity, revealing insights into the mutational and functional profile that underlies this comorbidity and contributing to a better understanding of the relationship between these two disabling disorders.
ABSTRACT
We introduce drexml, a command line tool and Python package for rational data-driven drug repurposing. The package employs machine learning and mechanistic signal transduction modeling to identify drug targets capable of regulating a particular disease. In addition, it employs explainability tools to contextualize potential drug targets within the functional landscape of the disease. The methodology is validated in Fanconi Anemia and Familial Melanoma, two distinct rare diseases where there is a pressing need for solutions. In the Fanconi Anemia case, the model successfully predicts previously validated repurposed drugs, while in the Familial Melanoma case, it identifies a promising set of drugs for further investigation.
ABSTRACT
BACKGROUND: Retinitis pigmentosa (RP) is the prevailing genetic cause of blindness in developed nations, and no effective treatments exist. In the pursuit of unraveling the intricate dynamics underlying this complex disease, mechanistic models, rooted in systems biology, emerge as a tool of proven efficiency to elucidate the interplay between RP genes and their mechanisms. The integration of mechanistic models and drug-target interactions under the umbrella of machine learning methodologies provides a multifaceted approach that can boost the discovery of novel therapeutic targets, facilitating further drug repurposing in RP. METHODS: By mapping Retinitis Pigmentosa-related genes (obtained from the Orphanet, OMIM and HPO databases) onto KEGG signaling pathways, a collection of signaling functional circuits encompassing Retinitis Pigmentosa molecular mechanisms was defined. Next, a mechanistic model of the so-defined disease map, in which the effects of interventions can be simulated, was built. Then, an explainable multi-output random forest regressor was trained using normal tissue transcriptomic data to learn causal connections between targets of approved drugs from DrugBank and the functional circuits of the mechanistic disease map. The involvement of selected target genes was validated in rd10 mice, a murine model of Retinitis Pigmentosa. RESULTS: A mechanistic functional map of Retinitis Pigmentosa was constructed, resulting in 226 functional circuits belonging to 40 KEGG signaling pathways. The method predicted 109 targets of approved drugs in use with a potential effect on circuits corresponding to the nine hallmarks identified. Five of those targets were selected and experimentally validated in rd10 mice: Gabre, Gabra1 (GABARα1 protein), Slc12a5 (KCC2 protein), Grin1 (NR1 protein) and Glr2a. As a result, we provide a resource to evaluate the potential impact of drug target genes in Retinitis Pigmentosa. CONCLUSIONS: The possibility of building actionable disease models in combination with machine learning algorithms to learn causal drug-disease interactions opens new avenues for boosting drug discovery. Such mechanistically based hypotheses can guide and accelerate experimental validation by prioritizing drug target candidates. In this work, a mechanistic model describing the functional disease map of Retinitis Pigmentosa was developed, identifying five promising therapeutic candidates targeted by approved drugs. Further experimental validation will demonstrate the efficiency of this approach for systematic application to other rare diseases.
Subject(s)
Retinitis Pigmentosa , Mice , Animals , Retinitis Pigmentosa/drug therapy , Retinitis Pigmentosa/genetics , Retinitis Pigmentosa/metabolism , Signal Transduction
ABSTRACT
BACKGROUND: The introduction of direct-acting oral anticoagulants (DOACs) has been shown to decrease atrial fibrillation (AF)-related stroke and bleeding rates in clinical studies, but there is no conclusive evidence about their effects at the population level. Our aim was to assess changes in AF-related stroke and major bleeding rates between 2012 and 2019 in Andalusia (Spain), and the association between DOAC use and event rates at the population level. METHODS: All patients with an AF diagnosis from 2012 to 2019 were identified using the Andalusian Health Population Base, which provides clinical information on all Andalusian people. Annual ischemic and hemorrhagic stroke rates, major bleeding rates, and the antithrombotic treatments used were determined. Marginal hazard ratios (HR) were calculated for each treatment. RESULTS: A total of 95,085 patients with an AF diagnosis were identified. Mean age was 76.1±10.2 years (49.7% women). An increase in the use of DOACs was observed throughout the study period in both males and females (p<0.001). The annual rate of ischemic stroke decreased by one third, while those of hemorrhagic stroke and major bleeding decreased 2-3-fold from 2012 to 2019. Marginal HR was lower than 0.50 for DOACs compared to VKA for all ischemic or hemorrhagic events. CONCLUSIONS: In this contemporary population-based study using clinical and administrative databases in Andalusia, a significant reduction in the incidence of AF-related ischemic and hemorrhagic stroke and major bleeding was observed between 2012 and 2019. The increased use of DOACs seems to be associated with this reduction.
Subject(s)
Atrial Fibrillation , European People , Hemorrhagic Stroke , Stroke , Male , Humans , Female , Aged , Aged, 80 and over , Atrial Fibrillation/complications , Atrial Fibrillation/drug therapy , Atrial Fibrillation/epidemiology , Anticoagulants/adverse effects , Hemorrhage/chemically induced , Hemorrhage/epidemiology , Hemorrhage/drug therapy , Stroke/epidemiology , Stroke/etiology , Stroke/prevention & control , Administration, Oral
ABSTRACT
Soft tissue sarcoma is an umbrella term for a group of rare cancers that are difficult to treat. In addition to surgery, neoadjuvant chemotherapy has shown the potential to downstage tumors and prevent micrometastases. However, finding effective therapeutic targets remains a research challenge. Here, a previously developed computational approach called mechanistic models of signaling pathways has been employed to unravel the impact of observed changes at the gene expression level on the ultimate functional behavior of cells. In the context of such a mechanistic model, RNA-Seq counts sourced from the Recount3 resource, from The Cancer Genome Atlas (TCGA) Sarcoma project, and non-diseased sarcomagenic tissues from the Genotype-Tissue Expression (GTEx) project were utilized to investigate signal transduction activity through signaling pathways. This approach provides a precise view of the relationship between sarcoma patient survival and the signaling landscape in tumors and their environment. Despite the distinct regulatory alterations observed in each sarcoma subtype, this study identified 13 signaling circuits, or elementary sub-pathways triggering specific cell functions, present across all subtypes, belonging to eight signaling pathways, which served as predictors for patient survival. Additionally, nine signaling circuits from five signaling pathways that highlighted the modifications tumor samples underwent in comparison to normal tissues were found. These results describe the protective role of the immune system, suggesting an anti-tumorigenic effect in the tumor microenvironment, in the process of tumor cell detachment and migration, or the dysregulation of ion homeostasis. Also, the analysis of signaling circuit intermediary proteins suggests multiple strategies for therapy.
Subject(s)
Sarcoma , Soft Tissue Neoplasms , Humans , Sarcoma/pathology , RNA-Seq , Gene Expression Profiling , Tumor Microenvironment/genetics
ABSTRACT
PURPOSE: Despite extensive vaccination campaigns in many countries, COVID-19 is still a major worldwide health problem because of its associated morbidity and mortality. Therefore, finding efficient treatments as fast as possible is a pressing need. Drug repurposing constitutes a convenient alternative when the need for new drugs in an unexpected medical scenario is urgent, as is the case with COVID-19. METHODS: Using data from a central registry of electronic health records (the Andalusian Population Health Database), the effect on patient outcomes, including survival and lymphocyte progression, of drugs taken for other indications prior to hospitalization was studied in a retrospective cohort of 15,968 individuals, comprising all COVID-19 patients hospitalized in Andalusia between January and November 2020. RESULTS: Covariate-adjusted hazard ratios and analysis of lymphocyte progression curves support a significant association between consumption of 21 different drugs and better patient survival. In contrast, one drug, furosemide, was associated with a significant increase in patient mortality. CONCLUSIONS: In this study we have taken advantage of the availability of a regional clinical database to study the effect on survival of drugs that patients were taking for other indications. The large size of the database allowed us to control for covariates effectively.
Subject(s)
COVID-19 , Humans , Retrospective Studies , COVID-19/epidemiology , Treatment Outcome , Databases, Factual , Furosemide
ABSTRACT
Single-cell RNA sequencing (scRNA-seq) is increasing our understanding of the behavior of complex tissues and organs by providing unprecedented detail on the cell type landscape at the level of individual cells. Cell type definition and functional annotation are key steps in understanding the molecular processes behind the underlying cellular communication machinery. However, the exponential growth of scRNA-seq data has made manual cell annotation unfeasible, due not only to the unparalleled resolution of the technology but also to the ever-increasing heterogeneity of the data. Many supervised and unsupervised methods have been proposed to automatically annotate cells. Supervised approaches for cell-type annotation outperform unsupervised methods except when new (unknown) cell types are present. Here, we introduce SigPrimedNet, an artificial neural network approach that leverages (i) efficient training by means of a sparsity-inducing signaling circuits-informed layer, (ii) feature representation learning through supervised training, and (iii) unknown cell-type identification by fitting an anomaly detection method on the learned representation. We show that SigPrimedNet can efficiently annotate known cell types while keeping a low false-positive rate for unseen cells across a set of publicly available datasets. In addition, the learned representation acts as a proxy for signaling circuit activity measurements, which provide useful estimates of cell functionalities.
ABSTRACT
The reprogramming of metabolism is a recognized cancer hallmark. It is well known that different signaling pathways regulate and orchestrate this reprogramming, which contributes to cancer initiation and development. However, evidence is accumulating that several metabolites could play a relevant role in regulating signaling pathways. To assess the potential role of metabolites in the regulation of signaling pathways, both metabolic and signaling pathway activities of Breast invasive Carcinoma (BRCA) have been modeled using mechanistic models. Gaussian Processes, powerful machine learning methods, were used in combination with SHapley Additive exPlanations (SHAP), a recent methodology that conveys causality, to obtain potential causal relationships between the production of metabolites and the regulation of signaling pathways. A total of 317 metabolites were found to have a strong impact on signaling circuits. The results presented here point to a crosstalk between signaling and metabolic pathways that is more complex than previously thought.
Subject(s)
Breast Neoplasms , Humans , Female , Breast Neoplasms/metabolism , Signal Transduction , Machine Learning , Metabolic Networks and Pathways
ABSTRACT
OBJECTIVES: More than two years into the COVID-19 pandemic, SARS-CoV-2 remains a global public health problem. Successive waves of infection have produced new SARS-CoV-2 variants with new mutations whose impact on COVID-19 severity and patient survival is uncertain. METHODS: A total of 764 SARS-CoV-2 genomes, sequenced from COVID-19 patients hospitalized from 19 February 2020 to 30 April 2021, along with their clinical data, were used for survival analysis. RESULTS: A significant association of B.1.1.7, the alpha lineage, with patient mortality (log hazard ratio (LHR) = 0.51, C.I. = [0.14,0.88]) was found upon adjustment for all the covariates known to affect COVID-19 prognosis. Moreover, survival analysis of mutations in the SARS-CoV-2 genome revealed that 27 of them were significantly associated with higher patient mortality. Most of these mutations were located in the genes coding for the S, ORF8, and N proteins. CONCLUSIONS: This study illustrates how a combination of genomic and clinical data can provide solid evidence for the impact of viral lineage on patient survival.
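As a reading aid for the log hazard ratio reported above: a log hazard ratio and its confidence bounds convert to the hazard-ratio scale by exponentiation. The minimal Python sketch below applies that conversion to the reported B.1.1.7 estimate; the function name `log_hr_to_hr` is an illustrative assumption and is not taken from the study.

```python
import math


def log_hr_to_hr(log_hr, ci_low, ci_high):
    """Exponentiate a log hazard ratio and its confidence bounds."""
    return math.exp(log_hr), math.exp(ci_low), math.exp(ci_high)


# Values as reported in the abstract: LHR = 0.51, C.I. = [0.14, 0.88].
hr, low, high = log_hr_to_hr(0.51, 0.14, 0.88)
print(f"HR = {hr:.2f}, CI = [{low:.2f}, {high:.2f}]")
# prints: HR = 1.67, CI = [1.15, 2.41]
```

A hazard ratio above 1 whose interval excludes 1 corresponds to a log hazard ratio above 0 whose interval excludes 0, which is why the association is described as significant.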
Subject(s)
COVID-19 , SARS-CoV-2 , Genome, Viral , Humans , Mutation , Pandemics , Phylogeny , SARS-CoV-2/genetics
ABSTRACT
The gut microbiome is gaining interest because of its links with several diseases, including colorectal cancer (CRC), as well as the possibility of using it to obtain non-intrusive predictive disease biomarkers. Here we performed a meta-analysis of 1042 fecal metagenomic samples from seven publicly available studies. We used an interpretable machine learning approach based on functional profiles, instead of the conventional taxonomic profiles, to produce a highly accurate predictor of CRC with better precision than that of previous proposals. Moreover, this approach is also able to discriminate samples with adenoma, which makes it very promising for CRC prevention by detecting early stages in which intervention is easier and more effective. In addition, interpretable machine learning methods allow the features relevant for the classification to be extracted, which reveals basic molecular mechanisms accounting for the changes undergone by the microbiome functional landscape in the transition from healthy gut to adenoma and CRC conditions. Functional profiles have demonstrated greater accuracy than taxonomic profiles in predicting CRC and adenoma conditions and, in a context of explainable machine learning, additionally provide useful hints on the molecular mechanisms operating in the microbiota behind these conditions.
Subject(s)
Adenoma/microbiology , Colorectal Neoplasms/microbiology , Gastrointestinal Microbiome , Machine Learning , Metagenomics/methods , Humans
ABSTRACT
Recent studies have demonstrated a relevant role of host genetics in coronavirus disease 2019 (COVID-19) prognosis. Most of the 7000 rare diseases described to date have a genetic component, typically highly penetrant. However, this vast spectrum of genetic variability remains unexplored with respect to possible interactions with COVID-19. Here, a mathematical mechanistic model of the COVID-19 molecular disease mechanism has been used to detect potential interactions between rare disease genes and the COVID-19 infection process and its downstream consequences. Out of the 2518 disease genes analyzed, causative of 3854 rare diseases, a total of 254 genes have a direct effect on the COVID-19 molecular disease mechanism and 207 have an indirect effect revealed by a significant strong correlation. This remarkable potential for interaction occurs for >300 rare diseases. Mechanistic modeling of the COVID-19 disease map has allowed a holistic, systematic analysis of the potential interactions between the loss of function in known rare disease genes and the pathological consequences of COVID-19 infection. The results identify links between disease genes and COVID-19 hallmarks and demonstrate the usefulness of the proposed approach for future preventive measures in some rare diseases.
Subject(s)
COVID-19 , Virus Diseases , COVID-19/genetics , Humans , Models, Statistical , Rare Diseases/genetics
ABSTRACT
BACKGROUND: Single-cell RNA sequencing (scRNA-seq) data provide valuable insights into cellular heterogeneity, which is significantly improving current knowledge of biology and human disease. One of the main applications of scRNA-seq data analysis is the identification of new cell types and cell states. Deep neural networks (DNNs) are among the best methods to address this problem. However, this performance comes at the cost of a lack of interpretability in the results. In this work we propose an intelligible pathway-driven neural network to correctly solve cell-type related problems at single-cell resolution while providing a biologically meaningful representation of the data. RESULTS: In this study, we explored deep neural networks constrained by several types of prior biological information, e.g. signaling pathway information, as a way to reduce the dimensionality of scRNA-seq data. We tested the proposed biologically based architectures on thousands of cells of human and mouse origin across a collection of public datasets in order to check the performance of the model. Specifically, we tested the architecture across different validation scenarios that try to mimic how unknown cell types are clustered by the DNN and how it correctly annotates cell types by querying a database in a retrieval problem. Moreover, our approach proved to be comparable to other, less interpretable DNN approaches constrained by protein-protein interaction and gene regulation data. Finally, we show how the latent structure learned by the network could be used to visualize and interpret the composition of human single-cell datasets. CONCLUSIONS: Here we demonstrate how the integration of pathways, which convey fundamental information on functional relationships between genes, with DNNs, which provide an excellent classification framework, results in an excellent alternative for learning a biologically meaningful representation of scRNA-seq data. In addition, the introduction of prior biological knowledge in the DNN reduces the size of the network architecture. Comparative results demonstrate a superior performance of this approach with respect to other similar approaches. As an additional advantage, the use of pathways within the DNN structure enables easy interpretability of the results by connecting features to cell functionalities by means of the pathway nodes, as demonstrated with an example using human melanoma tumor cells.
ABSTRACT
Retinitis pigmentosa (RP) is the most common inherited retinal dystrophy causing progressive vision loss. It is accompanied by chronic and sustained inflammation, including M1 microglia activation. This study evaluated the effect of an essential fatty acid (EFA) supplement containing specialized pro-resolving mediators (SPMs) on retinal degeneration and microglia activation in rd10 mice, a model of RP, as well as on LPS-stimulated BV2 cells. The EFA supplement was orally administered to mice from postnatal day (P)9 to P18. At P18, the electrical activity of the retina was examined by electroretinography (ERG) and innate behavior in response to light was measured. Retinal degeneration was studied via histology, including the TUNEL assay and microglia immunolabeling. Microglia polarization (M1/M2) was assessed by flow cytometry, qPCR, ELISA and histology. Redox status was analyzed by measuring antioxidant enzymes and markers of oxidative damage. The EFA supplement ameliorated retinal dysfunction and degeneration by improving ERG recordings and sensitivity to light and by reducing photoreceptor cell loss. The EFA supplement reduced inflammation and microglia activation, attenuating M1 markers and inducing a shift to the M2 phenotype in rd10 mouse retinas and LPS-stimulated BV2 cells. It also reduced oxidative stress markers of lipid peroxidation and carbonylation. These findings could open up new therapeutic opportunities based on resolving inflammation through oral supplementation with SPMs, such as the EFA supplement.
ABSTRACT
BACKGROUND: The current SARS-CoV-2 pandemic has emphasized the utility of viral whole-genome sequencing in the surveillance and control of the pathogen. An unprecedented ongoing global initiative is producing hundreds of thousands of sequences worldwide. However, the complex circumstances in which viruses are sequenced, along with the demand for urgent results, cause a high rate of incomplete and, therefore, useless sequences. Viral sequences evolve in the context of a complex phylogeny, and different positions along the genome are in linkage disequilibrium. Therefore, an imputation method should be able to predict missing positions from the available sequencing data. RESULTS: We have developed the impuSARS application, which takes advantage of the enormous number of SARS-CoV-2 genomes available, using a reference panel containing 239,301 sequences, to impute missing data in viral genomes. ImpuSARS was tested in a wide range of conditions (continuous fragments, amplicons or sparse individual positions missing), showing great fidelity when reconstructing the original sequences and recovering the lineage with 100% precision for almost all the lineages, even in very poorly covered genomes (<20%). CONCLUSIONS: Imputation can improve the pace of SARS-CoV-2 sequencing production by recovering many incomplete or low-quality sequences that would otherwise be discarded. ImpuSARS can be incorporated into any primary data processing pipeline for SARS-CoV-2 whole-genome sequencing.
Subject(s)
Genome, Viral , SARS-CoV-2 , Phylogeny , SARS-CoV-2/genetics , Whole Genome Sequencing
ABSTRACT
COVID-19 is a major worldwide health problem because of its associated acute respiratory distress syndrome and mortality. Several lines of evidence have suggested a relationship between the vitamin D endocrine system and the severity of COVID-19. We present a survival study on a retrospective cohort of 15,968 patients, comprising all COVID-19 patients hospitalized in Andalusia between January and November 2020. Based on a central registry of electronic health records (the Andalusian Population Health Database, BPS), prescriptions of vitamin D or its metabolites within 15-30 days before hospitalization were recorded. The effect of vitamin D (metabolite) prescriptions issued for other indications prior to hospitalization was studied with respect to patient survival. Kaplan-Meier survival curves and hazard ratios support an association between prescription of these metabolites and patient survival. This association was stronger for calcifediol (hazard ratio, HR = 0.67, with 95% confidence interval, CI, of [0.50-0.91]) than for cholecalciferol (HR = 0.75, with 95% CI of [0.61-0.91]) when prescribed 15 days prior to hospitalization. Although the relation is maintained, there is a general decrease of this effect when a longer period of 30 days prior to hospitalization is considered (calcifediol HR = 0.73, with 95% CI [0.57-0.95], and cholecalciferol HR = 0.88, with 95% CI [0.75-1.03]), suggesting that the association was stronger when the prescription was closer to hospitalization.
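For readers unfamiliar with the Kaplan-Meier curves mentioned above, the following is a minimal pure-Python sketch of the product-limit estimator. It is not the study's code: the function name `kaplan_meier` and the toy follow-up data are hypothetical, chosen only to illustrate how deaths and censored observations enter the estimate.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical toy cohort: five patients, two of them censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print([(t, round(s, 3)) for t, s in curve])
# prints: [(2, 0.8), (3, 0.6), (5, 0.3)]
```

Censored patients reduce the number at risk without lowering the curve, which is what distinguishes this estimate from a simple proportion of survivors.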
Subject(s)
COVID-19/mortality , Calcifediol/therapeutic use , Vitamin D/therapeutic use , Female , Humans , Kaplan-Meier Estimate , Male , Retrospective Studies , Spain/epidemiology , Survival Analysis
ABSTRACT
Genome-scale mechanistic models of pathways are gaining importance for genomic data interpretation because they provide a natural link between genotype measurements (transcriptomics or genomics data) and the phenotype of the cell (its functional behavior). Moreover, mechanistic models can be used to predict the potential effect of interventions, including drug inhibitions. Here, we present the implementation of a mechanistic model of cell signaling for the interpretation of transcriptomic data as an R/Bioconductor package, a Cytoscape plugin and a web tool with enhanced functionality which includes building interpretable predictors, estimation of the effect of perturbations and assessment of the effect of mutations in complex scenarios.
ABSTRACT
Here we present a web interface that implements a comprehensive mechanistic model of the SARS-CoV-2 disease map. In this framework, the detailed activity of the human signaling circuits related to the viral infection, ranging from the entry and replication mechanisms to downstream consequences such as inflammation and antigenic response, can be inferred from gene expression experiments. Moreover, the effect of potential interventions, such as knock-downs or drug effects (currently the system models the effect of more than 8000 DrugBank drugs), can be studied. This freely available tool not only provides an unprecedentedly detailed view of the mechanisms of viral invasion and its consequences in the cell but also has the potential to become an invaluable asset in the search for efficient antiviral treatments.
ABSTRACT
Knowledge of the genetic variability of the local population is of utmost importance in personalized medicine and has proved to be a critical factor for the discovery of new disease variants. Here, we present the Collaborative Spanish Variability Server (CSVS), which currently contains more than 2000 genomes and exomes of unrelated Spanish individuals. This database has been generated in a collaborative crowdsourcing effort, collecting sequencing data produced by local genomic projects and for other purposes. Sequences have been grouped by ICD10 upper categories. A web interface allows the database to be queried after removing one or more ICD10 categories. In this way, aggregated allele frequency counts for the pseudo-control Spanish population can be obtained for diseases belonging to the removed category. In addition to pseudo-control studies, some population studies are also possible, for example on the prevalence of pharmacogenomic variants. These genomic data have also been used to define the first Spanish Genome Reference Panel (SGRP1.0) for imputation. This is the first local repository of variability entirely produced by a crowdsourcing effort and constitutes an example for future initiatives to characterize local variability worldwide. CSVS is also part of the GA4GH Beacon network. CSVS can be accessed at: http://csvs.babelomics.org/.
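The pseudo-control idea described above can be illustrated with a small Python sketch: allele counts are stored per ICD10 category, and the frequency for a study is computed from every category except the one under study. This is only an assumption-laden illustration, not the CSVS implementation or its API; the function name, the category labels and the counts are all hypothetical.

```python
def pseudo_control_frequency(counts_by_category, excluded):
    """Aggregate allele frequency over all categories not in `excluded`.

    counts_by_category maps a category label to a tuple
    (alternate allele count, total alleles genotyped) for one variant.
    """
    alt = sum(a for cat, (a, n) in counts_by_category.items()
              if cat not in excluded)
    total = sum(n for cat, (a, n) in counts_by_category.items()
                if cat not in excluded)
    if total == 0:
        return None  # no pseudo-control samples remain
    return alt / total


# Hypothetical counts for a single variant in three ICD10 categories.
counts = {"I": (3, 400), "VI": (12, 600), "IX": (5, 1000)}

# Studying a disease in category "VI": drop that category from the controls.
freq = pseudo_control_frequency(counts, excluded={"VI"})
print(f"{freq:.4f}")
# prints: 0.0057
```

Removing the category of interest avoids counting carriers of the studied disease among the controls, at the cost of a smaller reference sample.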
Subject(s)
Crowdsourcing , Databases, Genetic , Genetics, Population/methods , Genome, Human , Software , Alleles , Chromosome Mapping , Exome , Gene Frequency , Genetic Variation , Genomics , Humans , Internet , Precision Medicine/methods , Spain
Subject(s)
COVID-19 Drug Treatment , Drug Repositioning , Proteins/antagonists & inhibitors , SARS-CoV-2/drug effects , COVID-19/pathology , COVID-19/virology , Computational Chemistry , Humans , Machine Learning , Molecular Docking Simulation , Molecular Targeted Therapy , Proteins/chemistry , SARS-CoV-2/pathogenicity , Signal Transduction/drug effects
ABSTRACT
Spinal muscular atrophy (SMA) is a severe neuromuscular autosomal recessive disorder affecting 1/10,000 live births. Most SMA patients present a homozygous deletion of SMN1, while the vast majority of SMA carriers present only a single SMN1 copy. The sequence similarity between SMN1 and SMN2 and the complexity of the SMN locus make the estimation of the SMN1 copy number by next-generation sequencing (NGS) very difficult. Here, we present SMAca, the first Python tool to detect SMA carriers and estimate the absolute SMN1 copy number using NGS data. Moreover, SMAca takes advantage of the knowledge of certain variants specific to SMN1 duplication to also identify silent carriers. This tool has been validated with a cohort of 326 samples from the Navarra 1000 Genomes Project (NAGEN1000). SMAca was developed with a focus on execution speed and easy installation. This combination makes it especially suitable for integration into production NGS pipelines. Source code and documentation are available at https://www.github.com/babelomics/SMAca.