Search | PAHO/WHO Uruguay

1.

Efficient minimizer orders for large values of k using minimum decycling sets.

Pellow, David; Pu, Lianrong; Ekim, Baris; Kotlar, Lior; Berger, Bonnie; Shamir, Ron; Orenstein, Yaron.

Genome Res ; 33(7): 1154-1161, 2023 07.

Article in English | MEDLINE | ID: mdl-37558282

ABSTRACT

Minimizers are ubiquitously used in data structures and algorithms for efficient searching, mapping, and indexing of high-throughput DNA sequencing data. Minimizer schemes select a minimum k-mer in every L-long subsequence of the target sequence, where minimality is with respect to a predefined k-mer order. Commonly used minimizer orders select more k-mers than necessary and therefore provide limited improvement in runtime and memory usage of downstream analysis tasks. The recently introduced universal k-mer hitting sets produce minimizer orders with fewer selected k-mers. Generating compact universal k-mer hitting sets is currently infeasible for k > 13, and thus, they cannot help in the many applications that require minimizer orders for larger k. Here, we close the gap of efficient minimizer orders for large values of k by introducing decycling-set-based minimizer orders: new minimizer orders based on minimum decycling sets. We show that in practice these new minimizer orders select a number of k-mers comparable to that of minimizer orders based on universal k-mer hitting sets and can also scale to a larger k. Furthermore, we developed a method that computes the minimizers in a sequence on the fly without keeping the k-mers of a decycling set in memory. This enables the use of these minimizer orders for any value of k. We expect the new orders to improve the runtime and memory usage of algorithms and data structures in high-throughput DNA sequencing analysis.

Subject(s)

Algorithms , Software , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods
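
The minimizer scheme described in the abstract of record 1 can be illustrated with a minimal sketch: in every window of w consecutive k-mers (an L-long subsequence with L = w + k - 1), the smallest k-mer under a chosen order is selected. Lexicographic order is used here purely as a stand-in; the decycling-set-based orders of the paper are not implemented, and the function and parameter names are hypothetical.

```python
def minimizer_positions(seq, k, w, order=None):
    """Return the sorted start positions of the k-mers selected as minimizers.

    Each window holds w consecutive k-mers (window length L = w + k - 1).
    `order` maps a k-mer to a sortable key; the lexicographic default is an
    assumption for illustration, not the order proposed in the paper.
    """
    if order is None:
        order = lambda kmer: kmer
    n_kmers = len(seq) - k + 1
    selected = set()
    for start in range(n_kmers - w + 1):
        # Leftmost minimum k-mer in the window; ties are broken by position.
        best = min(range(start, start + w),
                   key=lambda i: (order(seq[i:i + k]), i))
        selected.add(best)
    return sorted(selected)


def density(seq, k, w, order=None):
    """Fraction of k-mer positions selected; lower means a sparser sketch."""
    n_kmers = len(seq) - k + 1
    return len(minimizer_positions(seq, k, w, order)) / n_kmers
```

Passing a different `order` function changes which k-mers are selected and therefore the density, which is the quantity the abstract refers to when it compares how many k-mers each order selects.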

2.

Drugst.One - a plug-and-play solution for online systems medicine and network-based drug repurposing.

Maier, Andreas; Hartung, Michael; Abovsky, Mark; Adamowicz, Klaudia; Bader, Gary D; Baier, Sylvie; Blumenthal, David B; Chen, Jing; Elkjaer, Maria L; Garcia-Hernandez, Carlos; Helmy, Mohamed; Hoffmann, Markus; Jurisica, Igor; Kotlyar, Max; Lazareva, Olga; Levi, Hagai; List, Markus; Lobentanzer, Sebastian; Loscalzo, Joseph; Malod-Dognin, Noel; Manz, Quirin; Matschinske, Julian; Mee, Miles; Oubounyt, Mhaned; Pastrello, Chiara; Pico, Alexander R; Pillich, Rudolf T; Poschenrieder, Julian M; Pratt, Dexter; Przulj, Natasa; Sadegh, Sepideh; Saez-Rodriguez, Julio; Sarkar, Suryadipto; Shaked, Gideon; Shamir, Ron; Trummer, Nico; Turhan, Ugur; Wang, Rui-Sheng; Zolotareva, Olga; Baumbach, Jan.

Nucleic Acids Res ; 52(W1): W481-W488, 2024 Jul 05.

Article in English | MEDLINE | ID: mdl-38783119

ABSTRACT

In recent decades, the development of new drugs has become increasingly expensive and inefficient, and the molecular mechanisms of most pharmaceuticals remain poorly understood. In response, computational systems and network medicine tools have emerged to identify potential drug repurposing candidates. However, these tools often require complex installation and lack intuitive visual network mining capabilities. To tackle these challenges, we introduce Drugst.One, a platform that assists specialized computational medicine tools in becoming user-friendly, web-based utilities for drug repurposing. With just three lines of code, Drugst.One turns any systems biology software into an interactive web tool for modeling and analyzing complex protein-drug-disease networks. Demonstrating its broad adaptability, Drugst.One has been successfully integrated with 21 computational systems medicine tools. Available at https://drugst.one, Drugst.One has significant potential for streamlining the drug discovery process, allowing researchers to focus on essential aspects of pharmaceutical treatment research.

Subject(s)

Drug Repositioning , Software , Drug Repositioning/methods , Humans , Internet , Drug Discovery/methods , Systems Biology/methods , Computational Biology/methods

3.

The predictive capacity of polygenic risk scores for disease risk is only moderately influenced by imputation panels tailored to the target population.

Levi, Hagai; Elkon, Ran; Shamir, Ron.

Bioinformatics ; 40(2)2024 02 01.

Article in English | MEDLINE | ID: mdl-38265251

ABSTRACT

MOTIVATION: Polygenic risk scores (PRSs) predict individuals' genetic risk of developing complex diseases. They summarize the effect of many variants discovered in genome-wide association studies (GWASs). However, to date, large GWASs exist primarily for the European population and the quality of PRS prediction declines when applied to other ethnicities. Genetic profiling of individuals in the discovery set (on which the GWAS was performed) and target set (on which the PRS is applied) is typically done by SNP arrays that genotype a fraction of common SNPs. Therefore, a key step in GWAS analysis and PRS calculation is imputing untyped SNPs using a panel of fully sequenced individuals. The imputation results depend on the ethnic composition of the imputation panel. Imputing genotypes with a panel of individuals of the same ethnicity as the genotyped individuals typically improves imputation accuracy. However, there has been no systematic investigation into the influence of the ethnic composition of imputation panels on the accuracy of PRS predictions when applied to ethnic groups that differ from the population used in the GWAS. RESULTS: We estimated the effect of imputation of the target set on prediction accuracy of PRS when the discovery and the target sets come from different ethnic groups. We analyzed binary phenotypes on ethnically distinct sets from the UK Biobank and other resources. We generated ethnically homogeneous panels, imputed the target sets, and generated PRSs. Then, we assessed the prediction accuracy obtained from each imputation panel. Our analysis indicates that using an imputation panel matched to the ethnicity of the target population yields only a marginal improvement and only under specific conditions. AVAILABILITY AND IMPLEMENTATION: The source code used for executing the analyses in this paper is available at https://github.com/Shamir-Lab/PRS-imputation-panels.

Subject(s)

Genetic Risk Score , Genome-Wide Association Study , Humans , Genome-Wide Association Study/methods , Genotype , Phenotype , Software , Polymorphism, Single Nucleotide
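
As a rough illustration of how a PRS summarizes many variants (record 3), the sketch below uses the common additive form: the score is the sum, over SNPs, of the GWAS effect size times the individual's risk-allele dosage. The abstract does not state which PRS method the authors used, so the additive form, the function name, and the SNP identifiers are all assumptions made for illustration.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Additive PRS for one individual (illustrative sketch).

    dosages:      dict mapping SNP id -> risk-allele dosage in [0, 2];
                  imputed SNPs may carry fractional dosages.
    effect_sizes: dict mapping SNP id -> GWAS effect (e.g. log odds ratio).
    SNPs without a dosage (neither genotyped nor imputed) are skipped.
    """
    return sum(beta * dosages[snp]
               for snp, beta in effect_sizes.items()
               if snp in dosages)


# Hypothetical effect sizes and dosages; snp_c is untyped and not imputed.
effects = {"snp_a": 0.5, "snp_b": -0.25, "snp_c": 1.0}
individual = {"snp_a": 2, "snp_b": 1.0}
score = polygenic_risk_score(individual, effects)
```

In this form, imputation matters because it decides which SNPs have a dosage at all and how accurate that dosage is. That is the link between the imputation panel and PRS accuracy that the study examines.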

4.

Integration of gene expression and DNA methylation data across different experiments.

Itai, Yonatan; Rappoport, Nimrod; Shamir, Ron.

Nucleic Acids Res ; 51(15): 7762-7776, 2023 08 25.

Article in English | MEDLINE | ID: mdl-37395437

ABSTRACT

Integrative analysis of multi-omic datasets has proven to be extremely valuable in cancer research and precision medicine. However, obtaining multimodal data from the same samples is often difficult. Integrating multiple datasets of different omics remains a challenge, with only a few available algorithms developed to solve it. Here, we present INTEND (IntegratioN of Transcriptomic and EpigeNomic Data), a novel algorithm for integrating gene expression and DNA methylation datasets covering disjoint sets of samples. To enable integration, INTEND learns a predictive model between the two omics by training on multi-omic data measured on the same set of samples. In comprehensive testing on 11 TCGA (The Cancer Genome Atlas) cancer datasets spanning 4329 patients, INTEND achieves significantly superior results compared with four state-of-the-art integration algorithms. We also demonstrate INTEND's ability to uncover connections between DNA methylation and the regulation of gene expression in the joint analysis of two lung adenocarcinoma single-omic datasets from different sources. INTEND's data-driven approach makes it a valuable multi-omic data integration tool. The code for INTEND is available at https://github.com/Shamir-Lab/INTEND.

Subject(s)

DNA Methylation , Neoplasms , Humans , DNA Methylation/genetics , Neoplasms/genetics , Algorithms , Gene Expression Profiling , Transcriptome/genetics

5.

CT-FOCS: a novel method for inferring cell type-specific enhancer-promoter maps.

Hait, Tom Aharon; Elkon, Ran; Shamir, Ron.

Nucleic Acids Res ; 50(10): e55, 2022 06 10.

Article in English | MEDLINE | ID: mdl-35100425

ABSTRACT

Spatiotemporal gene expression patterns are governed to a large extent by the activity of enhancer elements, which engage in physical contacts with their target genes. Identification of enhancer-promoter (EP) links that are functional only in a specific subset of cell types is a key challenge in understanding gene regulation. We introduce CT-FOCS (cell type FOCS), a statistical inference method that uses linear mixed effect models to infer EP links that show marked activity only in a single or a small subset of cell types out of a large panel of probed cell types. Analyzing 808 samples from FANTOM5, covering 472 cell lines, primary cells and tissues, CT-FOCS inferred such EP links more accurately than recent state-of-the-art methods. Furthermore, we show that strictly cell type-specific EP links are very uncommon in the human genome.

Subject(s)

Enhancer Elements, Genetic , Promoter Regions, Genetic , Gene Expression Regulation , Genome, Human , Humans , Single-Cell Analysis

6.

3CAC: improving the classification of phages and plasmids in metagenomic assemblies using assembly graphs.

Pu, Lianrong; Shamir, Ron.

Bioinformatics ; 38(Suppl_2): ii56-ii61, 2022 09 16.

Article in English | MEDLINE | ID: mdl-36124804

ABSTRACT

MOTIVATION: Bacteriophages and plasmids usually coexist with their host bacteria in microbial communities and play important roles in microbial evolution. Accurately identifying sequence contigs as phages, plasmids and bacterial chromosomes in mixed metagenomic assemblies is critical for further unraveling their functions. Many classification tools have been developed for identifying either phages or plasmids in metagenomic assemblies. However, only two classifiers, PPR-Meta and viralVerify, were proposed to simultaneously identify phages and plasmids in mixed metagenomic assemblies. Due to the very high fraction of chromosome contigs in the assemblies, both tools achieve high precision in the classification of chromosomes but perform poorly in classifying phages and plasmids. Short contigs in these assemblies are often wrongly classified or classified as uncertain. RESULTS: Here we present 3CAC, a new three-class classifier that improves the precision of phage and plasmid classification. 3CAC starts with an initial three-class classification generated by existing classifiers and improves the classification of short contigs and contigs with low confidence classification by using proximity in the assembly graph. Evaluation on simulated metagenomes and on real human gut microbiome samples showed that 3CAC outperformed PPR-Meta and viralVerify in both precision and recall, and increased F1-score by 10-60 percentage points. AVAILABILITY AND IMPLEMENTATION: The 3CAC software is available on https://github.com/Shamir-Lab/3CAC. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Bacteriophages , Metagenome , Bacteriophages/genetics , Humans , Metagenomics , Plasmids/genetics , Software
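
The abstract of record 6 says that 3CAC refines an initial three-class classification by using proximity in the assembly graph, but it does not give the rules. The sketch below shows only one simple way such proximity could be used, a strict majority vote among classified neighbors. It should not be read as the 3CAC algorithm; the function name, the label strings, and the tie handling are hypothetical.

```python
from collections import Counter


def refine_labels(labels, edges, uncertain="uncertain"):
    """Relabel uncertain contigs by a strict majority vote of their neighbors.

    labels: dict mapping contig id -> "phage", "plasmid", "chromosome",
            or the `uncertain` marker (initial classification).
    edges:  iterable of (contig, contig) pairs from the assembly graph.
    Votes use the initial labels only, so the result does not depend on
    the order in which contigs are visited.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    refined = dict(labels)
    for contig, label in labels.items():
        if label != uncertain:
            continue
        votes = Counter(labels[n] for n in neighbors.get(contig, ())
                        if labels.get(n, uncertain) != uncertain)
        if votes:
            (top, top_count), *rest = votes.most_common(2)
            # Relabel only when one class strictly outnumbers the others.
            if not rest or rest[0][1] < top_count:
                refined[contig] = top
    return refined
```

Contigs whose neighbors are tied, or that have no classified neighbors, keep their uncertain label in this sketch.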

7.

The DOMINO web-server for active module identification analysis.

Levi, Hagai; Rahmanian, Nima; Elkon, Ran; Shamir, Ron.

Bioinformatics ; 38(8): 2364-2366, 2022 04 12.

Article in English | MEDLINE | ID: mdl-35139202

ABSTRACT

MOTIVATION: Active module identification (AMI) is an essential step in many omics analyses. Such algorithms receive a gene network and a gene activity profile as input and report subnetworks that show significant over-representation of accrued activity signal ('active modules'). Such modules can point out key molecular processes in the analyzed biological conditions. RESULTS: We recently introduced a novel AMI algorithm called DOMINO and demonstrated that it detects active modules that capture biological signals with markedly improved rate of empirical validation. Here, we provide an online server that executes DOMINO, making it more accessible and user-friendly. To help the interpretation of solutions, the server provides GO enrichment analysis, module visualizations and accessible output formats for customized downstream analysis. It also enables running DOMINO with various gene identifiers of different organisms. AVAILABILITY AND IMPLEMENTATION: The server is available at http://domino.cs.tau.ac.il. Its codebase is available at https://github.com/Shamir-Lab.

Subject(s)

Algorithms , Software , Computers , Gene Regulatory Networks , Internet

8.

Parameterized syncmer schemes improve long-read mapping.

Dutta, Abhinav; Pellow, David; Shamir, Ron.

PLoS Comput Biol ; 18(10): e1010638, 2022 10.

Article in English | MEDLINE | ID: mdl-36306319

ABSTRACT

MOTIVATION: Sequencing long reads presents novel challenges to mapping. One such challenge is low sequence similarity between the reads and the reference, due to high sequencing error and mutation rates. This occurs, e.g., in a cancer tumor, or due to differences between strains of viruses or bacteria. A key idea in mapping algorithms is to sketch sequences with their minimizers. Recently, syncmers were introduced as an alternative sketching method that is more robust to mutations and sequencing errors. RESULTS: We introduce parameterized syncmer schemes (PSS), a generalization of syncmers, and provide a theoretical analysis for multi-parameter schemes. By combining PSS with downsampling or minimizers we can achieve any desired compression and window guarantee. We implemented the use of PSS in the popular minimap2 and Winnowmap2 mappers. In tests on simulated and real long-read data from a variety of genomes, the PSS-based algorithms, with scheme parameters selected on the basis of our theoretical analysis, reduced unmapped reads by 20-60% at high compression while usually using less memory. The advantage was more pronounced at low sequence identity. At sequence identity of 75% and medium compression, PSS-minimap had only 37% as many unmapped reads, and 8% fewer of the reads that did map were incorrectly mapped. Even at lower compression and error rates, PSS-based mapping mapped more reads than the original minimizer-based mappers as well as mappers using the original syncmer schemes. We conclude that using PSS can improve mapping of long reads in a wide range of settings.

Subject(s)

Data Compression , Software , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods , Data Compression/methods , Algorithms
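
The syncmer idea behind record 8 can be sketched briefly. A k-mer is selected when its smallest s-mer (s < k) starts at one of a set of allowed offsets: a single offset gives an open syncmer, the first and last offsets give a closed syncmer, and an arbitrary set of offsets gives a parameterized scheme. The abstract itself does not spell out this definition, so the code is an illustrative reading rather than the authors' implementation. Lexicographic s-mer order and leftmost tie-breaking are assumptions, and the function names are hypothetical.

```python
def min_smer_offset(kmer, s):
    """Offset of the smallest s-mer inside kmer (leftmost on ties)."""
    offsets = range(len(kmer) - s + 1)
    return min(offsets, key=lambda i: (kmer[i:i + s], i))


def pss_positions(seq, k, s, allowed_offsets):
    """Start positions of k-mers whose minimal s-mer lies at an allowed offset.

    allowed_offsets = {0}         -> open syncmers with offset 0
    allowed_offsets = {0, k - s}  -> closed syncmers
    any other set of offsets      -> a parameterized syncmer scheme
    """
    allowed = set(allowed_offsets)
    return [i for i in range(len(seq) - k + 1)
            if min_smer_offset(seq[i:i + k], s) in allowed]


seq = "ACGTACGT"
closed = pss_positions(seq, k=5, s=2, allowed_offsets={0, 3})
open_0 = pss_positions(seq, k=5, s=2, allowed_offsets={0})
```

Unlike a minimizer, the decision for each k-mer depends only on the k-mer itself, not on its neighbors in a window.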

9.

Sorting cancer karyotypes using double-cut-and-joins, duplications and deletions.

Zeira, Ron; Shamir, Ron.

Bioinformatics ; 37(11): 1489-1496, 2021 Jul 12.

Article in English | MEDLINE | ID: mdl-29726899

ABSTRACT

MOTIVATION: Problems of genome rearrangement are central in both evolution and cancer research. Most genome rearrangement models assume that the genome contains a single copy of each gene and the only changes in the genome are structural, i.e. reordering of segments. In contrast, tumor genomes also undergo numerical changes such as deletions and duplications, and thus the number of copies of genes varies. Dealing with unequal gene content is a very challenging task, addressed by few algorithms to date. More realistic models are needed to help trace genome evolution during tumorigenesis. RESULTS: Here, we present a model for the evolution of genomes with multiple gene copies using the operation types double-cut-and-joins, duplications and deletions. The events supported by the model are reversals, translocations, tandem duplications, segmental deletions and chromosomal amplifications and deletions, covering most types of structural and numerical changes observed in tumor samples. Our goal is to find a series of operations of minimum length that transform one karyotype into the other. We show that the problem is NP-hard and give an integer linear programming formulation that solves the problem exactly under some mild assumptions. We test our method on simulated genomes and on ovarian cancer genomes. Our study advances the state of the art in two ways: it allows a broader set of operations than extant models, thus being more realistic, and it is the first study attempting to reconstruct the full sequence of structural and numerical events during cancer evolution. AVAILABILITY AND IMPLEMENTATION: Code and data are available at https://github.com/Shamir-Lab/Sorting-Cancer-Karyotypes. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

10.

DOMINO: a network-based active module identification algorithm with reduced rate of false calls.

Levi, Hagai; Elkon, Ran; Shamir, Ron.

Mol Syst Biol ; 17(1): e9593, 2021 01.

Article in English | MEDLINE | ID: mdl-33471440

ABSTRACT

Algorithms for active module identification (AMI) are central to analysis of omics data. Such algorithms receive a gene network and nodes' activity scores as input and report subnetworks that show significant over-representation of accrued activity signal ("active modules"), thus representing biological processes that presumably play key roles in the analyzed conditions. Here, we systematically evaluated six popular AMI methods on gene expression and GWAS data. We observed that GO terms enriched in modules detected on the real data were often also enriched on modules found on randomly permuted data. This indicated that AMI methods frequently report modules that are not specific to the biological context measured by the analyzed omics dataset. To tackle this bias, we designed a permutation-based method that empirically evaluates GO terms reported by AMI methods. We used the method to fashion five novel AMI performance criteria. Last, we developed DOMINO, a novel AMI algorithm, that outperformed the other six algorithms in extensive testing on GE and GWAS data. Software is available at https://github.com/Shamir-Lab.

Subject(s)

Computational Biology/methods , Gene Regulatory Networks , Algorithms , Gene Expression Profiling , Genome-Wide Association Study , Humans , Molecular Sequence Annotation , Software

11.

Drosophila TRF2 is a preferential core promoter regulator.

Kedmi, Adi; Zehavi, Yonathan; Glick, Yair; Orenstein, Yaron; Ideses, Diana; Wachtel, Chaim; Doniger, Tirza; Waldman Ben-Asher, Hiba; Muster, Nemone; Thompson, James; Anderson, Scott; Avrahami, Dorit; Yates, John R; Shamir, Ron; Gerber, Doron; Juven-Gershon, Tamar.

Genes Dev ; 28(19): 2163-74, 2014 Oct 01.

Article in English | MEDLINE | ID: mdl-25223897

ABSTRACT

Transcription of protein-coding genes is highly dependent on the RNA polymerase II core promoter. Core promoters, generally defined as the regions that direct transcription initiation, consist of functional core promoter motifs (such as the TATA-box, initiator [Inr], and downstream core promoter element [DPE]) that confer specific properties to the core promoter. The known basal transcription factors that support TATA-dependent transcription are insufficient for in vitro transcription of DPE-dependent promoters. In search of a transcription factor that supports DPE-dependent transcription, we used a biochemical complementation approach and identified the Drosophila TBP (TATA-box-binding protein)-related factor 2 (TRF2) as an enriched factor in the fractions that support DPE-dependent transcription. We demonstrate that the short TRF2 isoform preferentially activates DPE-dependent promoters. DNA microarray analysis reveals the enrichment of DPE promoters among short TRF2 up-regulated genes. Using primer extension analysis and reporter assays, we show the importance of the DPE in transcriptional regulation of TRF2 target genes. It was previously shown that, unlike TBP, TRF2 fails to bind DNA containing TATA-boxes. Using microfluidic affinity analysis, we discovered that short TRF2-bound DNA oligos are enriched for Inr and DPE motifs. Taken together, our findings highlight the role of short TRF2 as a preferential core promoter regulator.

Subject(s)

Drosophila Proteins/metabolism , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Gene Expression Regulation , Telomeric Repeat Binding Protein 2/metabolism , Amino Acid Motifs , Animals , Cell Line , Cells, Cultured , Drosophila Proteins/genetics , Protein Binding , TATA Box , Telomeric Repeat Binding Protein 2/genetics

12.

PRODIGY: personalized prioritization of driver genes.

Dinstag, Gal; Shamir, Ron.

Bioinformatics ; 36(6): 1831-1839, 2020 03 01.

Article in English | MEDLINE | ID: mdl-31681944

ABSTRACT

MOTIVATION: Evolution of cancer is driven by few somatic mutations that disrupt cellular processes, causing abnormal proliferation and tumor development, whereas most somatic mutations have no impact on progression. Distinguishing those mutated genes that drive tumorigenesis in a patient is a primary goal in cancer therapy: Knowledge of these genes and the pathways on which they operate can illuminate disease mechanisms and indicate potential therapies and drug targets. Current research focuses mainly on cohort-level driver gene identification but patient-specific driver gene identification remains a challenge. METHODS: We developed a new algorithm for patient-specific ranking of driver genes. The algorithm, called PRODIGY, analyzes the expression and mutation profiles of the patient along with data on known pathways and protein-protein interactions. Prodigy quantifies the impact of each mutated gene on every deregulated pathway using the prize-collecting Steiner tree model. Mutated genes are ranked by their aggregated impact on all deregulated pathways. RESULTS: In testing on five TCGA cancer cohorts spanning >2500 patients and comparison to validated driver genes, Prodigy outperformed extant methods and ranking based on network centrality measures. Our results pinpoint the pleiotropic effect of driver genes and show that Prodigy is capable of identifying even very rare drivers. Hence, Prodigy takes a step further toward personalized medicine and treatment. AVAILABILITY AND IMPLEMENTATION: The Prodigy R package is available at: https://github.com/Shamir-Lab/PRODIGY. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genomics , Neoplasms , Algorithms , Humans , Mutation , Precision Medicine

13.

MONET: Multi-omic module discovery by omic selection.

Rappoport, Nimrod; Safra, Roy; Shamir, Ron.

PLoS Comput Biol ; 16(9): e1008182, 2020 09.

Article in English | MEDLINE | ID: mdl-32931516

ABSTRACT

Recent advances in experimental biology allow creation of datasets where several genome-wide data types (called omics) are measured per sample. Integrative analysis of multi-omic datasets in general, and clustering of samples in such datasets specifically, can improve our understanding of biological processes and discover different disease subtypes. In this work we present MONET (Multi Omic clustering by Non-Exhaustive Types), which takes a unique approach to multi-omic clustering. MONET discovers modules of similar samples, such that each module is allowed to have a clustering structure for only a subset of the omics. This approach differs from most existing multi-omic clustering algorithms, which assume a common structure across all omics, and from several recent algorithms that model distinct cluster structures. We tested MONET extensively on simulated data, on an image dataset, and on ten multi-omic cancer datasets from TCGA. Our analysis shows that MONET compares favorably with other multi-omic clustering methods. We demonstrate MONET's biological and clinical relevance by analyzing its results for Ovarian Serous Cystadenocarcinoma. We also show that MONET is robust to missing data, can cluster genes in multi-omic datasets, and can reveal modules of cell types in single-cell multi-omic data. Our work shows that MONET is a valuable tool that can provide complementary results to those provided by existing algorithms for multi-omic analysis.

Subject(s)

Algorithms , Genomics/methods , Cluster Analysis , Databases, Genetic , Humans , Neoplasms/genetics , Neoplasms/metabolism , Single-Cell Analysis

14.

PlasClass improves plasmid sequence classification.

Pellow, David; Mizrahi, Itzik; Shamir, Ron.

PLoS Comput Biol ; 16(4): e1007781, 2020 04.

Article in English | MEDLINE | ID: mdl-32243433

ABSTRACT

Many bacteria contain plasmids, but separating between contigs that originate on the plasmid and those that are part of the bacterial genome can be difficult. This is especially true in metagenomic assembly, which yields many contigs of unknown origin. Existing tools for classifying sequences of plasmid origin give less reliable results for shorter sequences, are trained using a fraction of the known plasmids, and can be difficult to use in practice. We present PlasClass, a new plasmid classifier. It uses a set of standard classifiers trained on the most current set of known plasmid sequences for different sequence lengths. We tested PlasClass sequence classification on held-out data and simulations, as well as publicly available bacterial isolates and plasmidome samples and plasmids assembled from metagenomic samples. PlasClass outperforms the state-of-the-art plasmid classification tool on shorter sequences, which constitute the majority of assembly contigs, allowing it to achieve higher F1 scores in classifying sequences from a wide range of datasets. PlasClass also uses significantly less time and memory. PlasClass can be used to easily classify plasmid and bacterial genome sequences in metagenomic or isolate assemblies. It is available under the MIT license from: https://github.com/Shamir-Lab/PlasClass.

Subject(s)

DNA , Plasmids , Sequence Analysis, DNA/methods , Software , Computational Biology/methods , DNA/classification , DNA/genetics , DNA, Bacterial/classification , DNA, Bacterial/genetics , Genome, Bacterial/genetics , Plasmids/classification , Plasmids/genetics

15.

Unravelling plasmidome distribution and interaction with its hosting microbiome.

Brown Kav, Aya; Rozov, Roye; Bogumil, David; Sørensen, Søren Johannes; Hansen, Lars Hestbjerg; Benhar, Itai; Halperin, Eran; Shamir, Ron; Mizrahi, Itzhak.

Environ Microbiol ; 22(1): 32-44, 2020 01.

Article in English | MEDLINE | ID: mdl-31602783

ABSTRACT

Horizontal gene transfer via plasmids plays a pivotal role in microbial evolution. The forces that shape plasmidome functionality and distribution in natural environments are insufficiently understood. Here, we present a comparative study of plasmidomes across adjacent microbial environments present in different individual rumen microbiomes. Our findings show that the rumen plasmidome displays enormous unknown functional potential currently unannotated in available databases. Nevertheless, this unknown functionality is conserved and shared with published rat gut plasmidome data. Moreover, the rumen plasmidome is highly diverse compared with the microbiome that hosts these plasmids, across both similar and different rumen habitats. Our analysis demonstrates that its structure is shaped more by stochasticity than selection. Nevertheless, the plasmidome is an active partner in its intricate relationship with the host microbiome, with both interacting with and responding to their environment.

Subject(s)

Bacteria/genetics , Microbiota/genetics , Plasmids/genetics , Rumen/microbiology , Animals , Gene Transfer, Horizontal

16.

NEMO: cancer subtyping by integration of partial multi-omic data.

Rappoport, Nimrod; Shamir, Ron.

Bioinformatics ; 35(18): 3348-3356, 2019 09 15.

Article in English | MEDLINE | ID: mdl-30698637

ABSTRACT

MOTIVATION: Cancer subtypes were usually defined based on molecular characterization of single omic data. Increasingly, measurements of multiple omic profiles for the same cohort are available. Defining cancer subtypes using multi-omic data may improve our understanding of cancer, and suggest more precise treatment for patients. RESULTS: We present NEMO (NEighborhood based Multi-Omics clustering), a novel algorithm for multi-omics clustering. Importantly, NEMO can be applied to partial datasets in which some patients have data for only a subset of the omics, without performing data imputation. In extensive testing on ten cancer datasets spanning 3168 patients, NEMO achieved results comparable to the best of nine state-of-the-art multi-omics clustering algorithms on full data and showed an improvement on partial data. On some of the partial data tests, PVC, a multi-view algorithm, performed better, but it is limited to two omics and to positive partial data. Finally, we demonstrate the advantage of NEMO in detailed analysis of partial data of AML patients. NEMO is fast and much simpler than existing multi-omics clustering algorithms, and avoids iterative optimization. AVAILABILITY AND IMPLEMENTATION: Code for NEMO and for reproducing all NEMO results in this paper is available on GitHub: https://github.com/Shamir-Lab/NEMO. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Neoplasms , Algorithms , Cluster Analysis , Humans , Software

17.

Inaccuracy of the log-rank approximation in cancer data analysis.

Rappoport, Nimrod; Shamir, Ron.

Mol Syst Biol ; 15(8): e8754, 2019 08.

Article in English | MEDLINE | ID: mdl-31464374

ABSTRACT

The log-rank test statistic is very broadly used in biology. Unfortunately, P-values based on the popular chi-square approximation are often inaccurate and can be misleading.

Subject(s)

Breast Neoplasms/mortality , Statistics as Topic , Animals , Antineoplastic Agents/pharmacology , Breast Neoplasms/drug therapy , Breast Neoplasms/pathology , Cluster Analysis , Datasets as Topic , Disease Models, Animal , Female , Humans , Software , Survival Analysis

18.

Using the kinetics of C-reactive protein response to improve the differential diagnosis between acute bacterial and viral infections.

Coster, Dan; Wasserman, Asaf; Fisher, Eyal; Rogowski, Ori; Zeltser, David; Shapira, Itzhak; Bernstein, Daniel; Meilik, Ahuva; Raykhshtat, Eli; Halpern, Pinchas; Berliner, Shlomo; Shenhar-Tsarfaty, Shani; Shamir, Ron.

Infection ; 48(2): 241-248, 2020 Apr.

Article in English | MEDLINE | ID: mdl-31873850

ABSTRACT

PURPOSE: Differential diagnosis between acute viral and bacterial infection is an emerging common challenge for a physician in the emergency department. Serum C-reactive protein (CRP) is used to support diagnosis of bacterial infection, but in patients admitted with low CRP, its ability to discriminate between viral and bacterial infections is limited. We aimed to use two consecutive CRP measurements in order to improve differential diagnosis between bacterial and viral infection. METHODS: A single-center retrospective cohort (n = 1629) study of adult patients admitted to the emergency department with a subsequent microbiological confirmation of either viral or bacterial infection. Trend of CRP was defined as the absolute difference between the first two measurements of CRP divided by the time between them, and we investigated the ability of this parameter to differentiate between viral and bacterial infection. RESULTS: In patients with relatively low initial CRP concentration (< 60 mg/L, n = 634 patients), where the uncertainty regarding the type of infection is the highest, the trend improved diagnosis accuracy (AUC 0.83 compared to 0.57 for the first CRP measurement). Trend values above 3.47 mg/L/h discriminated bacterial from viral infection with 93.8% specificity and 50% sensitivity. CONCLUSIONS: The proposed approach for using the kinetics of CRP in patients whose first CRP measurement is low can assist in differential diagnosis between acute bacterial and viral infection.

Subject(s)

Adaptor Proteins, Signal Transducing/metabolism , Bacterial Infections/diagnosis , Carrier Proteins/metabolism , LIM Domain Proteins/metabolism , Virus Diseases/diagnosis , Acute Disease , Adaptor Proteins, Signal Transducing/blood , Adult , Aged , Aged, 80 and over , Area Under Curve , Bacterial Infections/blood , Carrier Proteins/blood , Cohort Studies , Diagnosis, Differential , Female , Humans , LIM Domain Proteins/blood , Male , Middle Aged , Predictive Value of Tests , Retrospective Studies , Sensitivity and Specificity , Virus Diseases/blood

19.

Modelling and analysis of gene regulatory networks.

Karlebach, Guy; Shamir, Ron.

Nat Rev Mol Cell Biol ; 9(10): 770-80, 2008 Oct.

Article in English | MEDLINE | ID: mdl-18797474

ABSTRACT

Gene regulatory networks have an important role in every process of life, including cell differentiation, metabolism, the cell cycle and signal transduction. By understanding the dynamics of these networks we can shed light on the mechanisms of diseases that occur when these cellular processes are dysregulated. Accurate prediction of the behaviour of regulatory networks will also speed up biotechnological projects, as such predictions are quicker and cheaper than lab experiments. Computational methods, both for supporting the development of network models and for the analysis of their functionality, have already proved to be a valuable research tool.

Subject(s)

Gene Regulatory Networks , Models, Genetic , Algorithms , Animals , Bacteriophage lambda/genetics , Bacteriophage lambda/physiology , Humans , Linear Models , Mathematics , Models, Biological , Models, Statistical , Stochastic Processes , Transcription Factors/genetics , Transcription Factors/metabolism

20.

Multi-omic and multi-view clustering algorithms: review and cancer benchmark.

Rappoport, Nimrod; Shamir, Ron.

Nucleic Acids Res ; 46(20): 10546-10562, 2018 Nov 16.

Article in English | MEDLINE | ID: mdl-30295871

ABSTRACT

Recent high throughput experimental methods have been used to collect large biomedical omics datasets. Clustering of single omic datasets has proven invaluable for biological and medical research. The decreasing cost and development of additional high throughput methods now enable measurement of multi-omic data. Clustering multi-omic data has the potential to reveal further systems-level insights, but raises computational and biological challenges. Here, we review algorithms for multi-omics clustering, and discuss key issues in applying these algorithms. Our review covers methods developed specifically for omic data as well as generic multi-view methods developed in the machine learning community for joint clustering of multiple data types. In addition, using cancer data from TCGA, we perform an extensive benchmark spanning ten different cancer types, providing the first systematic comparison of leading multi-omics and multi-view clustering algorithms. The results highlight key issues regarding the use of single- versus multi-omics, the choice of clustering strategy, the power of generic multi-view methods and the use of approximated p-values for gauging solution quality. Due to the growing use of multi-omics data, we expect these issues to be important for future progress in the field.

Subject(s)

Computational Biology/methods , Genomics/methods , Neoplasms/genetics , Proteomics/methods , Algorithms , Bayes Theorem , Cluster Analysis , Databases, Factual , Humans , Machine Learning , Models, Statistical , Neoplasms/mortality , Probability , Prognosis
