1.
J Neurooncol ; 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38985431

ABSTRACT

PURPOSE: Brain metastases represent the most common intracranial tumors in adults and are associated with a poor prognosis. We used a personalized in vitro drug screening approach to characterize individual therapeutic vulnerabilities in brain metastases. METHODS: Short-term cultures of cancer cells isolated from brain metastasis patients were molecularly characterized using next-generation sequencing and functionally evaluated using high-throughput in vitro drug screening to characterize pharmacological treatment sensitivities. RESULTS: Next-generation sequencing identified matched genetic alterations in brain metastasis tissue samples and corresponding short-term cultures, suggesting that short-term cultures of brain metastases are suitable models for recapitulating the genetic profile of brain metastases that may determine their sensitivity to anti-cancer drugs. Employing a high-throughput in vitro drug screening platform, we successfully screened the cultures of five brain metastases for response to 267 anticancer compounds and related drug response to genetic data. Among others, we found that targeted treatment with JAK3, HER2, or FGFR3 inhibitors showed anti-cancer effects in individual brain metastasis cultures. CONCLUSION: Our preclinical study provides a proof-of-concept for combining molecular profiling with in vitro drug screening for predictive evaluation of therapeutic vulnerabilities in brain metastasis patients. This approach could advance the use of patient-derived cancer cells in clinical practice and might eventually facilitate decision-making for personalized drug treatment.

2.
J Comput Aided Mol Des ; 37(8): 357-371, 2023 08.
Article in English | MEDLINE | ID: mdl-37310542

ABSTRACT

An Online tool for Fragment-based Molecule Parametrization (OFraMP) is described. OFraMP is a web application for assigning atomic interaction parameters to large molecules by matching sub-fragments within the target molecule to equivalent sub-fragments within the Automated Topology Builder (ATB, atb.uq.edu.au) database. OFraMP identifies and compares alternative molecular fragments from the ATB database, which contains over 890,000 pre-parameterized molecules, using a novel hierarchical matching procedure. Atoms are considered within the context of an extended local environment (buffer region) with the degree of similarity between an atom in the target molecule and that in the proposed match controlled by varying the size of the buffer region. Adjacent matching atoms are combined into progressively larger matched sub-structures. The user then selects the most appropriate match. OFraMP also allows users to manually alter interaction parameters and automates the submission of missing substructures to the ATB in order to generate parameters for atoms in environments not represented in the existing database. The utility of OFraMP is illustrated using the anti-cancer agent paclitaxel and a dendrimer used in organic semiconductor devices. OFraMP applied to paclitaxel (ATB ID 35922).


Subject(s)
Software , Databases, Factual
3.
Orig Life Evol Biosph ; 52(4): 263-275, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36383289

ABSTRACT

Protein-coordinated iron-sulfur clusters drive electron flow within metabolic pathways for organisms throughout the tree of life. It is not known how iron-sulfur clusters were first incorporated into proteins. Structural analogies to iron-sulfide minerals present on early Earth suggest a connection in the evolution of both proteins and minerals. The availability of large protein and mineral crystallographic structure data sets provides an opportunity to explore co-evolution of proteins and minerals on a large scale using informatics approaches. However, quantitative comparisons are confounded by the infinite, repeating nature of the mineral lattice, in contrast to metal clusters in proteins, which are finite in size. We address this problem using the Niggli reduction to transform a mineral lattice to a finite, unique structure that when translated reproduces the crystal lattice. Protein and reduced mineral structures were represented as quotient graphs with the edges and nodes corresponding to bonds and atoms, respectively. We developed a graph theory-based method to calculate the maximum common connected edge subgraph (MCCES) between mineral and protein quotient graphs. MCCES can accommodate differences in structural volumes and easily allows additional chemical criteria to be considered when calculating similarity. To account for graph size differences, we use the Tversky similarity index. Using consistent criteria, we found little similarity between putative ancient iron-sulfur protein clusters and iron-sulfur mineral lattices, suggesting these metal sites are not as evolutionarily connected as once thought. We discuss possible evolutionary implications of these findings in addition to suggesting an alternative proxy, mineral surfaces, for better understanding the coevolution of the geosphere and biosphere.
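
The abstract above names the Tversky similarity index as its way of handling graph size differences. As a minimal sketch of the general index on plain sets (not the authors' implementation), the following assumes that each structure has already been reduced to a set of comparable edge labels; the function name, the default weights, and the example labels are illustrative assumptions, since the abstract does not state them.

```python
def tversky_index(a, b, alpha=0.5, beta=0.5):
    """Tversky similarity of two collections, treated as sets.

    alpha weights elements found only in `a`, beta those found only in `b`.
    alpha = beta = 1 gives the Jaccard index; alpha = beta = 0.5 gives Dice.
    """
    a, b = set(a), set(b)
    common = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    denom = common + alpha * only_a + beta * only_b
    # Two empty sets: similarity is reported as 0.0 here (a convention
    # chosen for this sketch, not something taken from the abstract).
    return common / denom if denom else 0.0


# Hypothetical edge labels for a small cluster and a larger lattice fragment.
cluster_edges = {"e1", "e2", "e3"}
lattice_edges = {"e2", "e3", "e4", "e5"}

# With alpha=1 and beta=0 the score ignores the surplus of the larger set.
print(tversky_index(cluster_edges, lattice_edges, alpha=1.0, beta=0.0))
```

Asymmetric weights are what make the index useful when one structure is much larger than the other: the unmatched part of the larger graph can be down-weighted instead of dominating the score.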


Subject(s)
Iron-Sulfur Proteins , Metalloproteins , Minerals , Iron-Sulfur Proteins/chemistry , Iron-Sulfur Proteins/metabolism , Sulfur/chemistry , Sulfur/metabolism , Iron/chemistry
4.
Bioinformatics ; 32(11): 1610-7, 2016 06 01.
Article in English | MEDLINE | ID: mdl-26315913

ABSTRACT

MOTIVATION: Haplotype assembly is the computational problem of reconstructing haplotypes in diploid organisms and is of fundamental importance for characterizing the effects of single-nucleotide polymorphisms on the expression of phenotypic traits. Haplotype assembly benefits greatly from the advent of 'future-generation' sequencing technologies and their capability to produce long reads at increasing coverage. Existing methods are not able to deal with such data in a fully satisfactory way, either because accuracy or performance degrades as read length and sequencing coverage increase or because they are based on restrictive assumptions. RESULTS: By exploiting a feature of future-generation technologies, the uniform distribution of sequencing errors, we designed an exact algorithm, called HapCol, that is exponential in the maximum number of corrections for each single-nucleotide polymorphism position and that minimizes the overall error-correction score. We performed an experimental analysis, comparing HapCol with the current state-of-the-art combinatorial methods both on real and simulated data. On a standard benchmark of real data, we show that HapCol is competitive with state-of-the-art methods, improving the accuracy and the number of phased positions. Furthermore, experiments on realistically simulated datasets revealed that HapCol requires significantly fewer computing resources, especially memory. Thanks to its computational efficiency, HapCol can overcome the limits of previous approaches, making it possible to phase datasets with higher coverage and without the traditional all-heterozygous assumption. AVAILABILITY AND IMPLEMENTATION: Our source code is available under the terms of the GNU General Public License at http://hapcol.algolab.eu/ CONTACT: bonizzoni@disco.unimib.it SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
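
To make the "overall error-correction score" in the abstract above concrete, here is a small sketch of a minimum error correction objective for diploid reads, solved by brute force. It is not HapCol: HapCol is described as an exact algorithm that bounds the corrections per position, whereas this toy simply tries every split of the reads into two groups and is only usable for a handful of reads. The read encoding (a dict from position to allele 0/1) and all names are assumptions made for illustration.

```python
from itertools import product


def mec_cost(reads, assignment):
    """Number of allele corrections needed for a given read bipartition.

    reads: list of dicts mapping SNP position -> allele (0 or 1).
    assignment: tuple with one entry (0 or 1) per read, naming its haplotype.
    Each haplotype picks its majority allele per position independently,
    so no all-heterozygous assumption is made.
    """
    positions = {p for read in reads for p in read}
    cost = 0
    for p in positions:
        for side in (0, 1):
            alleles = [r[p] for r, s in zip(reads, assignment)
                       if s == side and p in r]
            ones = sum(alleles)
            cost += min(ones, len(alleles) - ones)  # flip the minority
    return cost


def brute_force_mec(reads):
    """Try all 2^n bipartitions; return (best cost, best assignment)."""
    best = None
    for assignment in product((0, 1), repeat=len(reads)):
        cost = mec_cost(reads, assignment)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


# Five made-up reads over three SNP positions; a missing key means the
# read does not cover that position.
reads = [
    {0: 0, 1: 0, 2: 1},
    {0: 0, 1: 0},
    {1: 1, 2: 0},
    {0: 1, 1: 1, 2: 0},
    {0: 0, 1: 1, 2: 1},
]
cost, assignment = brute_force_mec(reads)
print(cost, assignment)
```

In this example three of the reads conflict pairwise, so no split into two groups is error-free and at least one correction is unavoidable.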


Subject(s)
Haplotypes , Algorithms , Diploidy , Polymorphism, Single Nucleotide , Sequence Analysis, DNA , Software
5.
Bioinformatics ; 32(11): 1678-85, 2016 06 01.
Article in English | MEDLINE | ID: mdl-26342232

ABSTRACT

MOTIVATION: The human microbiome plays a key role in health and disease. Thanks to comparative metatranscriptomics, the cellular functions that are deregulated by the microbiome in disease can now be computationally explored. Unlike gene-centric approaches, pathway-based methods provide a systemic view of such functions; however, they typically consider each pathway in isolation and in its entirety. They can therefore overlook the key differences that (i) span multiple pathways, (ii) contain bidirectionally deregulated components, (iii) are confined to a pathway region. To capture these properties, computational methods that reach beyond the scope of predefined pathways are needed. RESULTS: By integrating an existing module discovery algorithm into comparative metatranscriptomic analysis, we developed metaModules, a novel computational framework for automated identification of the key functional differences between health- and disease-associated communities. Using this framework, we recovered significantly deregulated subnetworks that were indeed recognized to be involved in two well-studied, microbiome-mediated oral diseases, such as butanoate production in periodontal disease and metabolism of sugar alcohols in dental caries. More importantly, our results indicate that our method can be used for hypothesis generation based on automated discovery of novel, disease-related functional subnetworks, which would otherwise require extensive and laborious manual assessment. AVAILABILITY AND IMPLEMENTATION: metaModules is available at https://bitbucket.org/alimay/metamodules/ CONTACT: a.may@vu.nl or s.abeln@vu.nl SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Microbiota , Algorithms , Dental Caries , Humans
6.
Appl Environ Microbiol ; 83(21)2017 Nov 01.
Article in English | MEDLINE | ID: mdl-28842544

ABSTRACT

Whooping cough is a highly contagious respiratory disease caused by Bordetella pertussis. Despite widespread vaccination, its incidence has been rising alarmingly, and yet, the physiology of B. pertussis remains poorly understood. We combined genome-scale metabolic reconstruction, a novel optimization algorithm, and experimental data to probe the full metabolic potential of this pathogen, using B. pertussis strain Tohama I as a reference. Experimental validation showed that B. pertussis secretes a significant proportion of nitrogen as arginine and purine nucleosides, which may contribute to modulation of the host response. We also found that B. pertussis can be unexpectedly versatile, being able to metabolize many compounds while displaying minimal nutrient requirements. It can grow without cysteine, using inorganic sulfur sources, such as thiosulfate, and it can grow on organic acids, such as citrate or lactate, as sole carbon sources, providing in vivo demonstration that its tricarboxylic acid (TCA) cycle is functional. Although the metabolic reconstruction of eight additional strains indicates that the structural genes underlying this metabolic flexibility are widespread, experimental validation suggests a role of strain-specific regulatory mechanisms in shaping metabolic capabilities. Among five alternative strains tested, three strains were shown to grow on substrate combinations requiring a functional TCA cycle, but only one strain could use thiosulfate. Finally, the metabolic model was used to rationally design growth media with >2-fold improvements in pertussis toxin production. This study thus provides novel insights into B. pertussis physiology and highlights the potential, but also the limitations, of models based solely on metabolic gene content. IMPORTANCE: The metabolic capabilities of Bordetella pertussis, the causative agent of whooping cough, were investigated from a systems-level perspective. We constructed a comprehensive genome-scale metabolic model for B. pertussis and challenged its predictions experimentally. This systems approach shed light on new potential host-microbe interactions and allowed us to rationally design novel growth media with >2-fold improvements in pertussis toxin production. Most importantly, we also uncovered the potential for metabolic flexibility of B. pertussis (significantly larger range of substrates than previously alleged; novel active pathways allowing growth in minimal, nearly mineral nutrient combinations where only the carbon source must be organic), although our results also highlight the importance of strain-specific regulatory determinants in shaping metabolic capabilities. Deciphering the underlying regulatory mechanisms appears to be crucial for a comprehensive understanding of B. pertussis's lifestyle and the epidemiology of whooping cough. The contribution of metabolic models in this context will require the extension of the genome-scale metabolic model to integrate this regulatory dimension.

7.
Bioinformatics ; 31(19): 3147-55, 2015 Oct 01.
Article in English | MEDLINE | ID: mdl-26023104

ABSTRACT

MOTIVATION: Integrative network analysis methods provide robust interpretations of differential high-throughput molecular profile measurements. They are often used in a biomedical context to generate novel hypotheses about the underlying cellular processes or to derive biomarkers for classification and subtyping. The underlying molecular profiles are frequently measured and validated on animal or cellular models. Therefore, the results are not immediately transferable to humans. In particular, this is also the case in a study of the recently discovered interleukin-17 producing helper T cells (Th17), which are fundamental for anti-microbial immunity but also known to contribute to autoimmune diseases. RESULTS: We propose a mathematical model for finding active subnetwork modules that are conserved between two species. These are sets of genes, one for each species, which (i) induce a connected subnetwork in a species-specific interaction network, (ii) show overall differential behavior and (iii) contain a large number of orthologous genes. We propose a flexible notion of conservation, which turns out to be crucial for the quality of the resulting modules in terms of biological interpretability. We propose an algorithm that finds provably optimal or near-optimal conserved active modules in our model. We apply our algorithm to understand the mechanisms underlying Th17 T cell differentiation in both mouse and human. As a main biological result, we find that the key regulation of Th17 differentiation is conserved between human and mouse. AVAILABILITY AND IMPLEMENTATION: xHeinz, an implementation of our algorithm, as well as all input data and results, are available at http://software.cwi.nl/xheinz and as a Galaxy service at http://services.cbib.u-bordeaux2.fr/galaxy in CBiB Tools. CONTACT: gunnar.klau@cwi.nl SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Data Mining , Models, Biological , Protein Interaction Maps , Animals , Gene Expression Profiling , Humans , Infant, Newborn , Mice , Species Specificity , Th17 Cells/cytology
8.
Mol Cell Proteomics ; 13(7): 1877-89, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24807868

ABSTRACT

The continuously evolving field of proteomics produces increasing amounts of data while improving the quality of protein identifications. Although quantitative measurements are becoming more popular, many proteomic studies are still based on non-quantitative methods for protein identification. These studies result in potentially large sets of identified proteins, where the biological interpretation of proteins can be challenging. Systems biology develops innovative network-based methods, which allow an integrated analysis of these data. Here we present a novel approach, which combines prior knowledge of protein-protein interactions (PPI) with proteomics data using functional similarity measurements of interacting proteins. This integrated network analysis exactly identifies network modules with a maximal consistent functional similarity reflecting biological processes of the investigated cells. We validated our approach on small (H9N2 virus-infected gastric cells) and large (blood constituents) proteomic data sets. Using this novel algorithm, we identified characteristic functional modules in virus-infected cells, comprising key signaling proteins (e.g. the stress-related kinase RAF1) and demonstrate that this method allows a module-based functional characterization of cell types. Analysis of a large proteome data set of blood constituents resulted in clear separation of blood cells according to their developmental origin. A detailed investigation of the T-cell proteome further illustrates how the algorithm partitions large networks into functional subnetworks each representing specific cellular functions. These results demonstrate that the integrated network approach not only allows a detailed analysis of proteome networks but also yields a functional decomposition of complex proteomic data sets and thereby provides deeper insights into the underlying cellular processes of the investigated system.


Subject(s)
Protein Interaction Mapping/methods , Protein Interaction Maps , Proteins/metabolism , T-Lymphocytes/metabolism , Algorithms , Cells, Cultured , Computer Simulation , Humans , Influenza A Virus, H9N2 Subtype , Proteome/metabolism , Proteomics , Proto-Oncogene Proteins c-raf/metabolism , Stomach/cytology , Stomach/virology , Systems Biology
9.
BMC Bioinformatics ; 15: 201, 2014 Jul 10.
Article in English | MEDLINE | ID: mdl-25002203

ABSTRACT

BACKGROUND: Biological networks have a growing importance for the interpretation of high-throughput "omics" data. Integrative network analysis makes use of statistical and combinatorial methods to extract smaller subnetwork modules, and performs enrichment analysis to annotate the modules with ontology terms or other available knowledge. This process results in an annotated module, which retains the original network structure and includes enrichment information as a set system. A major bottleneck is a lack of tools that allow exploring both the network structure of extracted modules and their annotations. RESULTS: This paper presents a visual analysis approach that targets small modules with many set-based annotations, and which displays the annotations as contours on top of a node-link diagram. We introduce an extension of self-organizing maps to lay out nodes, links, and contours in a unified way. An implementation of this approach is freely available as the Cytoscape app eXamine. CONCLUSIONS: eXamine accurately conveys small and annotated modules consisting of several dozens of proteins and annotations. We demonstrate that eXamine facilitates the interpretation of integrative network analysis results in a guided case study. This study has resulted in a novel biological insight regarding the virally-encoded G-protein coupled receptor US28.


Subject(s)
Proteins/analysis , Algorithms , Cluster Analysis , Models, Biological , Proteins/metabolism , Software
10.
Nucleic Acids Res ; 40(Web Server issue): W303-9, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22553365

ABSTRACT

CSA is a web server for the computation, evaluation and comprehensive comparison of pairwise protein structure alignments. Its exact alignment engine computes either optimal, top-scoring alignments or heuristic alignments with quality guarantee for the inter-residue distance-based scorings of contact map overlap, PAUL, DALI and MATRAS. These and additional, uploaded alignments are compared using a number of quality measures and intuitive visualizations. CSA brings new insight into the structural relationship of the protein pairs under investigation and is a valuable tool for studying structural similarities. It is available at http://csa.project.cwi.nl.


Subject(s)
Software , Structural Homology, Protein , Algorithms , Calmodulin/chemistry , Internet
11.
Nucleic Acids Res ; 40(6): e43, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22210863

ABSTRACT

Deregulation of cell signaling pathways plays a crucial role in the development of tumors. The identification of such pathways requires effective analysis tools that facilitate the interpretation of expression differences. Here, we present a novel and highly efficient method for identifying deregulated subnetworks in a regulatory network. Given a score for each node that measures the degree of deregulation of the corresponding gene or protein, the algorithm computes the heaviest connected subnetwork of a specified size reachable from a designated root node. This root node can be interpreted as a molecular key player responsible for the observed deregulation. To demonstrate the potential of our approach, we analyzed three gene expression data sets. In one scenario, we compared expression profiles of non-malignant primary mammary epithelial cells derived from BRCA1 mutation carriers and of epithelial cells without BRCA1 mutation. Our results suggest that oxidative stress plays an important role in epithelial cells of BRCA1 mutation carriers and that the activation of stress proteins may result in avoidance of apoptosis leading to an increased overall survival of cells with genetic alterations. In summary, our approach opens new avenues for the elucidation of pathogenic mechanisms and for the detection of molecular key players.
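
The problem stated in the abstract above (the heaviest connected subnetwork of a given size reachable from a root) can be illustrated on a toy network. The sketch below is a brute-force enumeration, not the authors' method, which the abstract only calls "highly efficient" without giving details. It assumes directed edges, requires every chosen node to be reachable from the root using only chosen nodes, and uses made-up gene names and scores.

```python
from itertools import combinations


def reachable_within(root, nodes, edges):
    """Nodes of `nodes` reachable from `root` along directed edges
    that stay inside `nodes` (depth-first search)."""
    seen = {root}
    stack = [root]
    while stack:
        u = stack.pop()
        for v in edges.get(u, ()):
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def heaviest_rooted_subnetwork(scores, edges, k):
    """Return (total score, root, node set) of the best size-k subnetwork.

    Exhaustive search over all roots and node subsets: exponential,
    intended only to make the objective concrete on tiny inputs.
    """
    best = (float("-inf"), None, None)
    for root in scores:
        others = [n for n in scores if n != root]
        for rest in combinations(others, k - 1):
            nodes = {root, *rest}
            if reachable_within(root, nodes, edges) == nodes:
                total = sum(scores[n] for n in nodes)
                if total > best[0]:
                    best = (total, root, frozenset(nodes))
    return best


# Hypothetical deregulation scores and regulatory edges (regulator -> target).
scores = {"A": 1.0, "B": 3.0, "C": -0.5, "D": 2.5, "E": 0.25}
edges = {"A": ["B", "C"], "B": ["E"], "C": ["D"]}

print(heaviest_rooted_subnetwork(scores, edges, k=3))
```

The root that wins the search plays the role the abstract assigns to a "molecular key player": every other node in the selected subnetwork lies downstream of it.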


Subject(s)
Algorithms , Gene Expression Regulation , Gene Regulatory Networks , Programming, Linear , Adenocarcinoma/genetics , Adenocarcinoma/metabolism , Breast/cytology , Breast/metabolism , Cell Line, Tumor , Colorectal Neoplasms/genetics , Colorectal Neoplasms/metabolism , Epithelial Cells/metabolism , Female , Gene Expression Profiling , Genes, BRCA1 , Glioma/genetics , Glioma/metabolism , Humans , Mutation , Protein Interaction Maps , Signal Transduction
12.
Genome Biol ; 25(1): 26, 2024 01 19.
Article in English | MEDLINE | ID: mdl-38243222

ABSTRACT

Potato is one of the world's major staple crops, and like many important crop plants, it has a polyploid genome. Polyploid haplotype assembly poses a major computational challenge. We introduce a novel strategy for the assembly of polyploid genomes and present an assembly of the autotetraploid potato cultivar Altus. Our method uses low-depth sequencing data from an offspring population to achieve chromosomal clustering and haplotype phasing on the assembly graph. Our approach generates high-quality assemblies of individual chromosomes with haplotype-specific sequence resolution of whole chromosome arms and can be applied in common breeding scenarios where collections of offspring are available.


Subject(s)
Solanum tuberosum , Tetraploidy , Humans , Haplotypes , Sequence Analysis, DNA , Solanum tuberosum/genetics , Plant Breeding , Polyploidy
13.
Sci Rep ; 14(1): 4068, 2024 02 19.
Article in English | MEDLINE | ID: mdl-38374282

ABSTRACT

The gut microbiome is a diverse ecosystem, dominated by bacteria; however, fungi, phages/viruses, archaea, and protozoa are also important members of the gut microbiota. Exploration of taxonomic compositions beyond bacteria as well as an understanding of the interaction between the bacteriome with the other members is limited using 16S rDNA sequencing. Here, we developed a pipeline enabling the simultaneous interrogation of the gut microbiome (bacteriome, mycobiome, archaeome, eukaryome, DNA virome) and of antibiotic resistance genes based on optimized long-read shotgun metagenomics protocols and custom bioinformatics. Using our pipeline we investigated the longitudinal composition of the gut microbiome in an exploratory clinical study in patients undergoing allogeneic hematopoietic stem cell transplantation (alloHSCT; n = 31). Pre-transplantation microbiomes exhibited a 3-cluster structure, characterized by Bacteroides spp./Phocaeicola spp., mixed composition and Enterococcus abundances. We revealed substantial inter-individual and temporal variabilities of microbial domain compositions, human DNA, and antibiotic resistance genes during the course of alloHSCT. Interestingly, viruses and fungi accounted for substantial proportions of microbiome content in individual samples. In the course of HSCT, bacterial strains were stable or newly acquired. Our results demonstrate the disruptive potential of alloHSCT on the gut microbiome and pave the way for future comprehensive microbiome studies based on long-read metagenomics.


Subject(s)
Gastrointestinal Microbiome , Hematopoietic Stem Cell Transplantation , Microbiota , Humans , Gastrointestinal Microbiome/genetics , Microbiota/genetics , Bacteria/genetics , Anti-Bacterial Agents , Fungi/genetics , DNA, Ribosomal , Metagenomics/methods
14.
BMC Bioinformatics ; 14 Suppl 15: S18, 2013.
Article in English | MEDLINE | ID: mdl-24564758

ABSTRACT

BACKGROUND: We study the problem of mapping proteins between two protein families in the presence of paralogs. This problem occurs as a difficult subproblem in coevolution-based computational approaches for protein-protein interaction prediction. RESULTS: Similar to prior approaches, our method is based on the idea that coevolution implies equal rates of sequence evolution among the interacting proteins, and we provide a first attempt to quantify this notion in a formal statistical manner. We call the units that are central to this quantification scheme the units of coevolution. A unit consists of two mapped protein pairs and its score quantifies the coevolution of the pairs. This quantification allows us to provide a maximum likelihood formulation of the paralog mapping problem and to cast it into a binary quadratic programming formulation. CONCLUSION: CUPID, our software tool based on a Lagrangian relaxation of this formulation, makes it, for the first time, possible to compute state-of-the-art quality pairings in a few minutes of runtime. In summary, we suggest a novel alternative to the earlier available approaches, which is statistically sound and computationally feasible.


Subject(s)
Proteins/analysis , Software , Amino Acid Sequence , Molecular Sequence Data , Proteins/chemistry , Sequence Alignment , Sequence Analysis, Protein
15.
J Mol Evol ; 77(4): 170-84, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23877342

ABSTRACT

The genetic code has a high level of error robustness. Using values of hydrophobicity scales as a proxy for amino acid character, and the mean square measure as a function quantifying error robustness, a value can be obtained for a genetic code which reflects the error robustness of that code. By comparing this value with a distribution of values belonging to codes generated by random permutations of amino acid assignments, the level of error robustness of a genetic code can be quantified. We present a calculation in which the standard genetic code is shown to be optimal. We obtain this result by (1) using recently updated values of polar requirement as input; (2) fixing seven assignments (Ile, Trp, His, Phe, Tyr, Arg, and Leu) based on aptamer considerations; and (3) using known biosynthetic relations of the 20 amino acids. This last point is reflected in an approach of subdivision (restricting the random reallocation of assignments to amino acid subgroups, the set of 20 being divided into four such subgroups). The three approaches to explain robustness of the code (specific selection for robustness, amino acid-RNA interactions leading to assignments, or a slow growth process of assignment patterns) are reexamined in light of our findings. We offer a comprehensive hypothesis, stressing the importance of biosynthetic relations, with the code evolving from an early stage with just glycine and alanine, via intermediate stages, towards 64 codons carrying today's meaning.


Subject(s)
Amino Acids/chemistry , Amino Acids/genetics , Genetic Code , Models, Genetic , Aptamers, Peptide/chemistry , Aptamers, Peptide/genetics , Codon , Evolution, Molecular , Protein Biosynthesis/genetics , Protein Biosynthesis/physiology
16.
Bioinformatics ; 28(14): 1887-94, 2012 Jul 15.
Article in English | MEDLINE | ID: mdl-22581175

ABSTRACT

MOTIVATION: High-throughput molecular data provide a wealth of information that can be integrated into network analysis. Several approaches exist that identify functional modules in the context of integrated biological networks. The objective of this study is 2-fold: first, to assess the accuracy and variability of identified modules and second, to develop an algorithm for deriving highly robust and accurate solutions. RESULTS: In a comparative simulation study accuracy and robustness of the proposed and established methodologies are validated, considering various sources of variation in the data. To assess this variation, we propose a jackknife resampling procedure resulting in an ensemble of optimal modules. A consensus approach summarizes the ensemble into one final module containing maximally robust nodes and edges. The resulting consensus module identifies and visualizes robust and variable regions by assigning support values to nodes and edges. Finally, the proposed approach is exemplified on two large gene expression studies: diffuse large B-cell lymphoma and acute lymphoblastic leukemia.
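
The jackknife-plus-consensus idea in the abstract above can be sketched in a few lines. This is a simplified illustration, not the published algorithm: the module finder is a stand-in (`toy_find_module`, a hypothetical function that keeps genes whose mean score exceeds a cutoff), support is computed for nodes only, and the consensus is obtained by simple thresholding rather than by the optimization the paper describes.

```python
from collections import Counter


def jackknife_modules(samples, find_module):
    """Run `find_module` on every leave-one-out subset of the samples."""
    return [find_module(samples[:i] + samples[i + 1:])
            for i in range(len(samples))]


def support_values(modules):
    """Fraction of ensemble modules that contain each node."""
    counts = Counter(node for module in modules for node in set(module))
    return {node: c / len(modules) for node, c in counts.items()}


def consensus_module(modules, min_support=0.5):
    """Nodes whose support reaches the threshold (a simplification)."""
    return {n for n, s in support_values(modules).items() if s >= min_support}


def toy_find_module(samples, cutoff=1.0):
    """Hypothetical stand-in for a real module-detection algorithm."""
    genes = samples[0].keys()
    return {g for g in genes
            if sum(s[g] for s in samples) / len(samples) > cutoff}


# Made-up per-sample gene scores.
samples = [
    {"g1": 2.0, "g2": 1.2, "g3": 0.0},
    {"g1": 2.0, "g2": 0.0, "g3": 0.0},
    {"g1": 2.0, "g2": 1.2, "g3": 4.0},
    {"g1": 2.0, "g2": 1.2, "g3": 0.0},
]

ensemble = jackknife_modules(samples, toy_find_module)
print(support_values(ensemble))
print(consensus_module(ensemble))
```

Here `g3` enters the module only because of one outlying sample, and its support value drops below 1 as soon as that sample is left out, which is the kind of variability the support values are meant to expose.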


Subject(s)
Algorithms , Computational Biology/methods , Gene Expression Profiling/methods , Computer Simulation , High-Throughput Nucleotide Sequencing , Humans , Lymphoma, Large B-Cell, Diffuse/genetics , Oligonucleotide Array Sequence Analysis , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics
17.
Bioinformatics ; 28(22): 2875-82, 2012 Nov 15.
Article in English | MEDLINE | ID: mdl-23060616

ABSTRACT

MOTIVATION: Next-generation sequencing techniques have facilitated a large-scale analysis of human genetic variation. Despite the advances in sequencing speed, the computational discovery of structural variants is not yet standard. It is likely that many variants have remained undiscovered in most sequenced individuals. RESULTS: Here, we present a novel internal segment size based approach, which organizes all, including concordant, reads into a read alignment graph, where max-cliques represent maximal contradiction-free groups of alignments. A novel algorithm then enumerates all max-cliques and statistically evaluates them for their potential to reflect insertions or deletions. For the first time in the literature, we compare a large range of state-of-the-art approaches using simulated Illumina reads from a fully annotated genome and present relevant performance statistics. We achieve superior performance, in particular, for deletions or insertions (indels) of length 20-100 nt. This has been previously identified as a remaining major challenge in structural variation discovery, in particular, for insert size based approaches. In this size range, we even outperform split-read aligners. We achieve competitive results also on biological data, where our method is the only one to make a substantial amount of correct predictions, which, additionally, are disjoint from those by split-read aligners. AVAILABILITY: CLEVER is open source (GPL) and available from http://clever-sv.googlecode.com. CONTACT: as@cwi.nl or tm@cwi.nl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
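
The abstract above relies on enumerating max-cliques in a read alignment graph. As a generic illustration of maximal clique enumeration, the sketch below uses the classic Bron-Kerbosch algorithm with pivoting. This is a swapped-in textbook technique, not CLEVER's own enumeration algorithm, and the small compatibility graph is invented for the example.

```python
def bron_kerbosch(adj, r=frozenset(), p=None, x=frozenset()):
    """Yield every maximal clique of an undirected graph.

    adj: dict mapping each vertex to the set of its neighbours.
    r: current clique; p: candidate vertices; x: already-processed vertices.
    """
    if p is None:
        p = frozenset(adj)
    if not p and not x:
        yield r  # nothing can extend r, so it is maximal
        return
    # Pivot on the vertex covering most candidates to prune branches.
    pivot = max(p | x, key=lambda v: len(adj[v] & p))
    for v in p - adj[pivot]:
        yield from bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v])
        p = p - {v}
        x = x | {v}


# Hypothetical alignments a..e; an edge means two alignments are compatible.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

for clique in bron_kerbosch(adj):
    print(sorted(clique))
```

Each reported clique is a largest-possible group of mutually compatible vertices, which matches the abstract's description of max-cliques as maximal contradiction-free groups of alignments.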


Subject(s)
Algorithms , Genetic Variation , Genome, Human , Computer Simulation , Humans , INDEL Mutation
18.
Cell Syst ; 14(12): 1122-1130.e3, 2023 12 20.
Article in English | MEDLINE | ID: mdl-38128484

ABSTRACT

The efficacy of epitope vaccines depends on the included epitopes as well as the probability that the selected epitopes are presented by the major histocompatibility complex (MHC) proteins of a vaccinated individual. Designing vaccines that effectively immunize a high proportion of the population is challenging because of high MHC polymorphism, diverging MHC-peptide binding affinities, and physical constraints on epitope vaccine constructs. Here, we present HOGVAX, a combinatorial optimization approach for epitope vaccine design. To optimize population coverage within the constraint of limited vaccine construct space, HOGVAX employs a hierarchical overlap graph (HOG) to identify and exploit overlaps between selected peptides and explicitly models the structure of linkage disequilibrium in the MHC. In a SARS-CoV-2 case study, we demonstrate that HOGVAX-designed vaccines contain substantially more epitopes than vaccines built from concatenated peptides and predict vaccine efficacy in over 98% of the population with high numbers of presented peptides in vaccinated individuals.
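
The space saving that the abstract above attributes to exploiting overlaps between peptides can be seen with two strings. The sketch below only computes the longest suffix-prefix overlap of a pair and merges them; it does not build a hierarchical overlap graph and is not HOGVAX. The function names are assumptions, and the second peptide string is made up for the example.

```python
def overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge(a, b):
    """Concatenate `a` and `b`, writing their shared overlap only once."""
    return a + b[overlap(a, b):]


first = "SIINFEKL"
second = "FEKLAAGG"  # hypothetical peptide sharing "FEKL" with the first

merged = merge(first, second)
print(merged, len(first + second), len(merged))
```

Plain concatenation would spend 16 residues on these two peptides, while the merged string carries both in 12, which is why overlap-aware construction can fit more epitopes into a construct of limited length.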


Subject(s)
COVID-19 , Vaccines , Humans , SARS-CoV-2 , COVID-19/prevention & control , Epitopes, T-Lymphocyte , Peptides
19.
Cell Genom ; 2(2)2022 Feb 09.
Article in English | MEDLINE | ID: mdl-35382456

ABSTRACT

Recent genome-wide CRISPR-Cas9 loss-of-function screens have identified genetic dependencies across many cancer cell lines. Associations between these dependencies and genomic alterations in the same cell lines reveal phenomena such as oncogene addiction and synthetic lethality. However, comprehensive identification of such associations is complicated by complex interactions between genes across genetically heterogeneous cancer types. We introduce and apply the algorithm SuperDendrix to CRISPR-Cas9 loss-of-function screens from 769 cancer cell lines, to identify differential dependencies across cell lines and to find associations between differential dependencies and combinations of genomic alterations and cell-type-specific markers. These associations respect the position and type of interactions within pathways: for example, we observe increased dependencies on downstream activators of pathways, such as NFE2L2, and decreased dependencies on upstream activators of pathways, such as CDK6. SuperDendrix also reveals dozens of dependencies on lineage-specific transcription factors, identifies cancer-type-specific correlations between dependencies, and enables annotation of individual mutated residues.

20.
iScience ; 25(6): 104461, 2022 Jun 17.
Article in English | MEDLINE | ID: mdl-35692633

ABSTRACT

An important challenge in genome assembly is haplotype phasing, that is, to reconstruct the different haplotype sequences of an individual genome. Phasing becomes considerably more difficult with increasing ploidy, which makes polyploid phasing a notoriously hard computational problem. We present a novel genetic phasing method for plant breeding with the aim to phase two deep-sequenced parental samples with the help of a large number of progeny samples sequenced at low depth. The key ideas underlying our approach are to (i) integrate the individually weak Mendelian progeny signals with a Bayesian log-likelihood model, (ii) cluster alleles according to their likelihood of co-occurrence, and (iii) assign them to haplotypes via an interval scheduling approach. We show on two deep-sequenced parental and 193 low-depth progeny potato samples that our approach computes high-quality sparse phasings and that it scales to whole genomes.
