Results 1 - 16 of 16
1.
Bioinformatics ; 39(12), 2023 12 01.
Article in English | MEDLINE | ID: mdl-38096571

ABSTRACT

MOTIVATION: Analysis of mutational signatures is a powerful approach for understanding the mutagenic processes that have shaped the evolution of a cancer genome. To evaluate the mutational signatures operative in a cancer genome, one first needs to quantify their activities by estimating the number of mutations imprinted by each signature. RESULTS: Here we present SigProfilerAssignment, a desktop and an online computational framework for assigning all types of mutational signatures to individual samples. SigProfilerAssignment is the first tool that allows both analysis of copy-number signatures and probabilistic assignment of signatures to individual somatic mutations. As its computational engine, the tool uses a custom implementation of the forward stagewise algorithm for sparse regression and nonnegative least squares for numerical optimization. Analysis of 2700 synthetic cancer genomes with and without noise demonstrates that SigProfilerAssignment outperforms four commonly used approaches for assigning mutational signatures. AVAILABILITY AND IMPLEMENTATION: SigProfilerAssignment is available under the BSD 2-clause license at https://github.com/AlexandrovLab/SigProfilerAssignment with a web implementation at https://cancer.sanger.ac.uk/signatures/assignment/.


Subject(s)
Neoplasms , Humans , Mutation , Neoplasms/genetics , Algorithms , Genome
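
The abstract for this record names nonnegative least squares and the probabilistic assignment of signatures to individual mutations. A minimal sketch of both ideas follows, under stated assumptions: the solver is a plain projected-gradient loop swapped in for illustration (not the tool's forward stagewise implementation), and the signature matrix, counts, and function names are all hypothetical.

```python
# Toy sketch: every name and number here is hypothetical, not taken from SigProfilerAssignment.

def nnls_projected_gradient(signatures, counts, steps=5000):
    """Minimise 0.5 * ||S a - m||^2 subject to a >= 0 by projected gradient descent.

    signatures: one row per mutation channel, one column per signature.
    counts: observed number of mutations in each channel.
    """
    n_channels = len(signatures)
    n_sigs = len(signatures[0])
    # The squared Frobenius norm bounds the largest eigenvalue of S^T S,
    # so its inverse is a safe (conservative) step size.
    step = 1.0 / sum(v * v for row in signatures for v in row)
    activities = [0.0] * n_sigs
    for _ in range(steps):
        residual = [
            sum(signatures[c][k] * activities[k] for k in range(n_sigs)) - counts[c]
            for c in range(n_channels)
        ]
        gradient = [
            sum(signatures[c][k] * residual[c] for c in range(n_channels))
            for k in range(n_sigs)
        ]
        # Gradient step followed by projection onto the nonnegative orthant.
        activities = [max(0.0, activities[k] - step * gradient[k]) for k in range(n_sigs)]
    return activities


def signature_probabilities(signatures, activities):
    """For each channel, the share of expected mutations contributed by each signature."""
    probabilities = []
    for row in signatures:
        weights = [row[k] * activities[k] for k in range(len(activities))]
        total = sum(weights)
        probabilities.append([w / total if total > 0 else 0.0 for w in weights])
    return probabilities


# Two hypothetical signatures over four mutation channels (each column sums to 1).
toy_signatures = [
    [0.7, 0.1],
    [0.1, 0.2],
    [0.1, 0.3],
    [0.1, 0.4],
]
# Noise-free counts generated from activities of 100 and 50 mutations.
toy_counts = [75.0, 20.0, 25.0, 30.0]

activities = nnls_projected_gradient(toy_signatures, toy_counts)
probs = signature_probabilities(toy_signatures, activities)
```

On this noise-free toy input the recovered activities approach 100 and 50, and a mutation in the first channel is attributed to the first signature with probability 70/75. Real catalogues are noisy and have many more channels, so a dedicated solver would normally be preferred.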
2.
bioRxiv ; 2023 Jul 11.
Article in English | MEDLINE | ID: mdl-37502962

ABSTRACT

Analysis of mutational signatures is a powerful approach for understanding the mutagenic processes that have shaped the evolution of a cancer genome. Here we present SigProfilerAssignment, a desktop and an online computational framework for assigning all types of mutational signatures to individual samples. SigProfilerAssignment is the first tool that allows both analysis of copy-number signatures and probabilistic assignment of signatures to individual somatic mutations. As its computational engine, the tool uses a custom implementation of the forward stagewise algorithm for sparse regression and nonnegative least squares for numerical optimization. Analysis of 2,700 synthetic cancer genomes with and without noise demonstrates that SigProfilerAssignment outperforms four commonly used approaches for assigning mutational signatures. SigProfilerAssignment is freely available at https://github.com/AlexandrovLab/SigProfilerAssignment with a web implementation at https://cancer.sanger.ac.uk/signatures/assignment/.

4.
Nat Cancer ; 4(2): 276-289, 2023 02.
Article in English | MEDLINE | ID: mdl-36702933

ABSTRACT

Analysis of mutational signatures can reveal underlying molecular mechanisms of the processes that have imprinted the somatic mutations found in cancer genomes. Here, we analyze single base substitutions and small insertions and deletions in pediatric cancers encompassing 785 whole-genome sequenced tumors from 27 molecularly defined cancer subtypes. We identified only a small number of mutational signatures active in pediatric cancers, compared with previously analyzed adult cancers. Further, we report a significant difference in the proportion of pediatric tumors showing homologous recombination repair defect signatures compared with previous analyses. In pediatric leukemias, we identified an indel signature, not previously reported, characterized by long insertions in nonrepeat regions, affecting mainly intronic and intergenic regions, but also exons of known cancer genes. We provide a systematic overview of COSMIC v.3 mutational signatures active across pediatric cancers, which is highly relevant for understanding tumor biology and enabling future research in defining biomarkers of treatment response.


Subject(s)
Neoplasms , Adult , Humans , Child , Mutation , Neoplasms/genetics , Oncogenes , INDEL Mutation , DNA Repair
5.
Cell Genom ; 2(11), 2022 Nov 09.
Article in English | MEDLINE | ID: mdl-36388765

ABSTRACT

Mutational signature analysis is commonly performed in cancer genomic studies. Here, we present SigProfilerExtractor, an automated tool for de novo extraction of mutational signatures, and benchmark it against another 13 bioinformatics tools by using 34 scenarios encompassing 2,500 simulated signatures found in 60,000 synthetic genomes and 20,000 synthetic exomes. For simulations with 5% noise, reflecting high-quality datasets, SigProfilerExtractor outperforms other approaches by elucidating between 20% and 50% more true-positive signatures while yielding 5-fold fewer false-positive signatures. Applying SigProfilerExtractor to 4,643 whole-genome- and 19,184 whole-exome-sequenced cancers reveals four novel signatures. Two of the signatures are confirmed in independent cohorts, and one of these signatures is associated with tobacco smoking. In summary, this report provides a reference tool for analysis of mutational signatures, a comprehensive benchmarking of bioinformatics tools for extracting signatures, and several novel mutational signatures, including one putatively attributed to direct tobacco smoking mutagenesis in bladder tissues.
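
The benchmark above counts true-positive and false-positive signatures. One plausible way to score an extraction against known ground truth is sketched here, assuming that an extracted signature counts as a true positive when its cosine similarity to a not-yet-matched true signature reaches a threshold. The 0.9 threshold, the greedy matching, and all vectors are assumptions made for illustration, not details stated in the abstract.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two nonzero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def score_extraction(extracted, truth, threshold=0.9):
    """Greedily match extracted signatures to true ones; return (TP, FP, FN).

    The threshold and the greedy one-to-one matching are illustrative assumptions.
    """
    unmatched_truth = list(range(len(truth)))
    true_pos = 0
    false_pos = 0
    for signature in extracted:
        best_index, best_similarity = None, threshold
        for index in unmatched_truth:
            similarity = cosine_similarity(signature, truth[index])
            if similarity >= best_similarity:
                best_index, best_similarity = index, similarity
        if best_index is None:
            false_pos += 1  # no remaining true signature is similar enough
        else:
            true_pos += 1
            unmatched_truth.remove(best_index)
    false_neg = len(unmatched_truth)  # true signatures that were never recovered
    return true_pos, false_pos, false_neg


# Hypothetical four-channel signatures.
truth = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.2, 0.3, 0.4]]
extracted = [[0.68, 0.12, 0.1, 0.1], [0.4, 0.4, 0.1, 0.1]]

result = score_extraction(extracted, truth)
```

Here the first extracted vector closely matches the first true signature, while the second matches neither, giving one true positive, one false positive, and one false negative.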

6.
Nature ; 606(7916): 984-991, 2022 06.
Article in English | MEDLINE | ID: mdl-35705804

ABSTRACT

Gains and losses of DNA are prevalent in cancer and emerge as a consequence of inter-related processes of replication stress, mitotic errors, spindle multipolarity and breakage-fusion-bridge cycles, among others, which may lead to chromosomal instability and aneuploidy [1,2]. These copy number alterations contribute to cancer initiation, progression and therapeutic resistance [3-5]. Here we present a conceptual framework to examine the patterns of copy number alterations in human cancer that is widely applicable to diverse data types, including whole-genome sequencing, whole-exome sequencing, reduced representation bisulfite sequencing, single-cell DNA sequencing and SNP6 microarray data. Deploying this framework to 9,873 cancers representing 33 human cancer types from The Cancer Genome Atlas [6] revealed a set of 21 copy number signatures that explain the copy number patterns of 97% of samples. Seventeen copy number signatures were attributed to biological phenomena of whole-genome doubling, aneuploidy, loss of heterozygosity, homologous recombination deficiency, chromothripsis and haploidization. The aetiologies of four copy number signatures remain unexplained. Some cancer types harbour amplicon signatures associated with extrachromosomal DNA, disease-specific survival and proto-oncogene gains such as MDM2. In contrast to base-scale mutational signatures, no copy number signature was associated with many known exogenous cancer risk factors. Our results synthesize the global landscape of copy number alterations in human cancer by revealing a diversity of mutational processes that give rise to these alterations.


Subject(s)
DNA Copy Number Variations , DNA Mutational Analysis , Neoplasms , Aneuploidy , Chromothripsis , DNA Copy Number Variations/genetics , Haploidy , Homologous Recombination/genetics , Humans , Loss of Heterozygosity/genetics , Mutation , Neoplasms/genetics , Neoplasms/pathology , Exome Sequencing
7.
Nat Genet ; 53(11): 1553-1563, 2021 11.
Article in English | MEDLINE | ID: mdl-34663923

ABSTRACT

Esophageal squamous cell carcinoma (ESCC) shows remarkable variation in incidence that is not fully explained by known lifestyle and environmental risk factors. It has been speculated that an unknown exogenous exposure(s) could be responsible. Here we combine the fields of mutational signature analysis with cancer epidemiology to study 552 ESCC genomes from eight countries with varying incidence rates. Mutational profiles were similar across all countries studied. Associations between specific mutational signatures and ESCC risk factors were identified for tobacco, alcohol, opium and germline variants, with modest impacts on mutation burden. We find no evidence of a mutational signature indicative of an exogenous exposure capable of explaining differences in ESCC incidence. Apolipoprotein B mRNA-editing enzyme, catalytic polypeptide-like (APOBEC)-associated mutational signatures single-base substitution (SBS)2 and SBS13 were present in 88% and 91% of cases, respectively, and accounted for 25% of the mutation burden on average, indicating that APOBEC activation is a crucial step in ESCC tumor development.


Subject(s)
Esophageal Neoplasms/epidemiology , Esophageal Neoplasms/genetics , Esophageal Squamous Cell Carcinoma/epidemiology , Esophageal Squamous Cell Carcinoma/genetics , Mutation , APOBEC Deaminases/genetics , Adult , Aged , Aged, 80 and over , Aldehyde Dehydrogenase, Mitochondrial/genetics , Brazil/epidemiology , China/epidemiology , Female , Humans , Incidence , Iran/epidemiology , Male , Middle Aged , Tumor Suppressor Protein p53/genetics , United Kingdom/epidemiology , Whole Genome Sequencing
8.
Nat Genet ; 53(9): 1348-1359, 2021 09.
Article in English | MEDLINE | ID: mdl-34493867

ABSTRACT

Lung cancer in never smokers (LCINS) is a common cause of cancer mortality but its genomic landscape is poorly characterized. Here high-coverage whole-genome sequencing of 232 LCINS showed 3 subtypes defined by copy number aberrations. The dominant subtype (piano), which is rare in lung cancer in smokers, features somatic UBA1 mutations, germline AR variants and stem cell-like properties, including low mutational burden, high intratumor heterogeneity, long telomeres, frequent KRAS mutations and slow growth, as suggested by the occurrence of cancer drivers' progenitor cells many years before tumor diagnosis. The other subtypes are characterized by specific amplifications and EGFR mutations (mezzo-forte) and whole-genome doubling (forte). No strong tobacco smoking signatures were detected, even in cases with exposure to secondhand tobacco smoke. Genes within the receptor tyrosine kinase-Ras pathway had distinct impacts on survival; five genomic alterations independently doubled mortality. These findings create avenues for personalized treatment in LCINS.


Subject(s)
DNA Copy Number Variations/genetics , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Non-Smokers/statistics & numerical data , Adult , Aged , Aged, 80 and over , ErbB Receptors/genetics , Female , Genome/genetics , Genome-Wide Association Study , Humans , Male , Middle Aged , Neoplastic Stem Cells/pathology , Proto-Oncogene Proteins p21(ras)/genetics , Receptors, Androgen/genetics , Risk Factors , Smoking/genetics , Ubiquitin-Activating Enzymes/genetics , Whole Genome Sequencing , Young Adult
9.
Cancer Discov ; 11(5): 1082-1099, 2021 05.
Article in English | MEDLINE | ID: mdl-33408242

ABSTRACT

Effective data sharing is key to accelerating research to improve diagnostic precision, treatment efficacy, and long-term survival in pediatric cancer and other childhood catastrophic diseases. We present St. Jude Cloud (https://www.stjude.cloud), a cloud-based data-sharing ecosystem for accessing, analyzing, and visualizing genomic data from >10,000 pediatric patients with cancer and long-term survivors, and >800 pediatric sickle cell patients. Harmonized genomic data totaling 1.25 petabytes are freely available, including 12,104 whole genomes, 7,697 whole exomes, and 2,202 transcriptomes. The resource is expanding rapidly, with regular data uploads from St. Jude's prospective clinical genomics programs. Three interconnected apps within the ecosystem (Genomics Platform, Pediatric Cancer Knowledgebase, and Visualization Community) enable simultaneously performing advanced data analysis in the cloud and enhancing the Pediatric Cancer Knowledgebase. We demonstrate the value of the ecosystem through use cases that classify 135 pediatric cancer subtypes by gene expression profiling and map mutational signatures across 35 pediatric cancer subtypes. SIGNIFICANCE: To advance research and treatment of pediatric cancer, we developed St. Jude Cloud, a data-sharing ecosystem for accessing >1.2 petabytes of raw genomic data from >10,000 pediatric patients and survivors, innovative analysis workflows, integrative multiomics visualizations, and a knowledgebase of published data contributed by the global pediatric cancer community. This article is highlighted in the In This Issue feature, p. 995.


Subject(s)
Anemia, Sickle Cell/genetics , Cloud Computing , Genomics , Information Dissemination , Neoplasms/genetics , Child , Ecosystem , Hospitals, Pediatric , Humans
10.
Methods Mol Biol ; 2185: 447-473, 2021.
Article in English | MEDLINE | ID: mdl-33165866

ABSTRACT

The genome of a cancer contains somatic mutations that reflect the activities of endogenous and exogenous mutational processes, with each mutational process imprinting a characteristic mutational signature. Computational analysis of somatic mutations derived from next-generation sequencing data can reveal the mutational signatures operative in a set of cancer genomes. In this chapter, we briefly review the concept of mutational signatures and the tools available for deciphering mutational signatures. Further, we provide a quick guide as well as an in-depth protocol for deciphering mutational signatures using the tool SigProfilerExtractor and review the results generated from an example dataset of cancer genomes.


Subject(s)
Computational Biology , Databases, Nucleic Acid , Genome, Human , High-Throughput Nucleotide Sequencing , Mutation , Neoplasms/genetics , Humans
11.
Nature ; 578(7793): 94-101, 2020 02.
Article in English | MEDLINE | ID: mdl-32025018

ABSTRACT

Somatic mutations in cancer genomes are caused by multiple mutational processes, each of which generates a characteristic mutational signature [1]. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [2] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we characterized mutational signatures using 84,729,690 somatic mutations from 4,645 whole-genome and 19,184 exome sequences that encompass most types of cancer. We identified 49 single-base-substitution, 11 doublet-base-substitution, 4 clustered-base-substitution and 17 small insertion-and-deletion signatures. The substantial size of our dataset, compared with previous analyses [3-15], enabled the discovery of new signatures, the separation of overlapping signatures and the decomposition of signatures into components that may represent associated, but distinct, DNA damage, repair and/or replication mechanisms. By estimating the contribution of each signature to the mutational catalogues of individual cancer genomes, we revealed associations of signatures to exogenous or endogenous exposures, as well as to defective DNA-maintenance processes. However, many signatures are of unknown cause. This analysis provides a systematic perspective on the repertoire of mutational processes that contribute to the development of human cancer.


Subject(s)
Mutation/genetics , Neoplasms/genetics , Age Factors , Base Sequence , Exome/genetics , Genome, Human/genetics , Humans , Sequence Analysis, DNA
12.
Biotechnol Lett ; 42(2): 287-294, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31802334

ABSTRACT

OBJECTIVES: Targeted therapies seek to selectively eliminate a pathogen without disrupting the resident microbial community. However, with selectivity comes the potential for developing bacterial resistance. Thus, a diverse range of targeting peptides must be made available. RESULTS: Two commonly used antimicrobial peptides (AMPs), plectasin and eurocin, were genetically fused to the targeting peptide A12C, which selectively binds to Staphylococcus species. The targeting peptide did not decrease activity against the targeted Staphylococcus aureus and Staphylococcus epidermidis, but drastically decreased activity against the nontargeted species, Enterococcus faecalis, Bacillus subtilis, Lactococcus lactis and Lactobacillus rhamnosus. This effect was equally evident across two different AMPs, two different species of Staphylococcus, four different negative control bacteria, and against both biofilm and planktonic forms of the bacteria. CONCLUSIONS: A12C, originally designed for targeted drug delivery, was repurposed to target antimicrobial peptides. This illustrates the wealth of ligands, both natural and synthetic, which can be adapted to develop a diverse array of targeting antimicrobial peptides.


Subject(s)
Antimicrobial Cationic Peptides/pharmacology , Defensins/genetics , Peptides/genetics , Staphylococcus/growth & development , Antimicrobial Cationic Peptides/genetics , Drug Repositioning , Gene Fusion , Microbial Viability/drug effects , Species Specificity , Staphylococcus/drug effects
13.
N Biotechnol ; 56: 63-70, 2020 May 25.
Article in English | MEDLINE | ID: mdl-31812667

ABSTRACT

As antibiotic-resistant bacterial pathogens become an ever-increasing concern, antimicrobial peptides (AMPs) have grown increasingly attractive as alternatives. Potentially, plants could be used as cost-effective AMP bioreactors; however, reported heterologous AMP expression is much lower in plants than in E. coli expression systems and often results in plant cytotoxicity, even for AMPs fused to carrier proteins. This suggests that there may be a physical characteristic of the previously described heterologous AMPs which impedes efficient expression in plants. Using a meta-analysis of protein databases, this study has determined that native plant AMPs were significantly less cationic than AMPs native to other taxa. To apply this finding to plant expression, the transient expression of 10 different heterologous AMPs, ranging in charge from +7 to -5, was tested in the tobacco, Nicotiana benthamiana. Elastin-like polypeptide (ELP) was used as the carrier protein for AMP expression. ELP fusion allowed for a simple, cost-effective temperature-shift purification. Using this system, all five anionic AMPs expressed well, with two at unusually high levels (375 and 563 µg/gfw). Furthermore, antimicrobial activity against Staphylococcus epidermidis was an order of magnitude greater (average minimum inhibitory concentration, MIC, of 0.26 µM) than that typically seen for AMPs expressed in E. coli systems and was associated with the uncleaved fusion peptide. In summary, this study describes a means of expressing AMP fusions in plants in high yield, purified by a simple temperature-shift protocol, resulting in a fusion peptide with high antimicrobial activity and without the need for a peptide cleavage step.


Subject(s)
Anti-Bacterial Agents/pharmacology , Antimicrobial Cationic Peptides/pharmacology , Bioreactors/economics , Nicotiana/chemistry , Staphylococcus epidermidis/drug effects , Anti-Bacterial Agents/chemistry , Anti-Bacterial Agents/metabolism , Antimicrobial Cationic Peptides/chemistry , Antimicrobial Cationic Peptides/metabolism , Databases, Protein , Dose-Response Relationship, Drug , Microbial Sensitivity Tests , Temperature , Nicotiana/metabolism
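
The abstract for this record ranks peptides by net charge (from +7 to -5) without saying how the charge was computed. A rough sketch of one common approximation is given below, assuming a simple residue count at neutral pH: lysine and arginine count as +1, aspartate and glutamate as -1, and histidine, the termini, and any modifications are ignored. The sequences are invented for illustration and are not peptides from the study.

```python
# Approximate side-chain charges at neutral pH; histidine and the termini are ignored (assumption).
RESIDUE_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(sequence):
    """Estimate a peptide's net charge by counting charged residues."""
    return sum(RESIDUE_CHARGE.get(residue, 0) for residue in sequence.upper())


def is_cationic(sequence):
    """True when the estimated net charge is positive."""
    return net_charge(sequence) > 0


# Invented example sequences, not taken from the study.
hypothetical_peptides = {
    "cationic_example": "GKKRLKCAGR",
    "anionic_example": "GDEELDCAGE",
}

charges = {name: net_charge(seq) for name, seq in hypothetical_peptides.items()}
```

A pH-dependent calculation based on pKa values would be more accurate, but this count is often enough to sort peptides into cationic and anionic groups.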
14.
Toxins (Basel) ; 10(6), 2018 06 19.
Article in English | MEDLINE | ID: mdl-29921767

ABSTRACT

Cystine-stabilized peptides represent a large family of peptides characterized by high structural stability and bactericidal, fungicidal, or insecticidal properties. Found throughout a wide range of taxa, this broad and functionally important family can be subclassified into distinct groups dependent upon their number and type of cystine bonding patterns, tertiary structures, and/or their species of origin. Furthermore, annotations of proteins related to the cystine-stabilized family are under-represented in the literature because these proteins are difficult to isolate and identify. As a result, there have been several recent attempts to collate them into data resources and build analytic tools for their dynamic prediction. Ultimately, the identification and delivery of new members of this family will lead to their growing inclusion in the repertoire of commercially viable alternatives to antibiotics and environmentally safe insecticides. This review of the literature and current state of cystine-stabilized peptide biology aims to better describe peptide subfamilies, identify databases and analytics resources associated with specific cystine-stabilized peptides, and highlight their current commercial success.


Subject(s)
Peptides/classification , Animals , Computer Simulation , Cystine/chemistry , Databases, Factual , Peptides/chemistry
15.
Sci Rep ; 8(1): 9049, 2018 06 13.
Article in English | MEDLINE | ID: mdl-29899538

ABSTRACT

Cystine-stabilized peptides have great utility as they naturally block ion channels, inhibit acetylcholine receptors, or inactivate microbes. However, only a tiny fraction of these peptides has been characterized. Exploration for novel peptides most efficiently starts with the identification of candidates from genome sequence data. Unfortunately, though cystine-stabilized peptides have shared structures, they have low DNA sequence similarity, restricting the utility of BLAST and even more powerful sequence alignment-based annotation algorithms, such as PSI-BLAST and HMMER. In contrast, a supervised machine learning approach may improve discovery and function assignment of these peptides. To this end, we employed our previously described m-NGSG algorithm, which utilizes hidden signatures embedded in peptide primary sequences that define and categorize structural or functional classes of peptides. From the generalized m-NGSG framework, we derived five specific models that categorize cystine-stabilized peptide sequences into specific functional classes. When compared with PSI-BLAST, HMMER and existing function-specific models, our novel approach (named CSPred) consistently demonstrates superior performance in discovery and function assignment. We also report an interactive version of CSPred, available through download (https://bitbucket.org/sm_islam/cystine-stabilized-proteins/src) or web interface (watson.ecs.baylor.edu/cspred), for the discovery of cystine-stabilized peptides of specific function from genomic datasets and for genome annotation. We fully describe, in the Availability section following the Discussion, the quick and simple usage of the CSPred website to automatically deliver function assignments for batch submissions of peptide sequences.


Subject(s)
Algorithms , Computational Biology/methods , Cystine/chemistry , Peptides/chemistry , Amino Acid Sequence , Cystine/genetics , Internet , Peptides/classification , Peptides/genetics , Reproducibility of Results , Supervised Machine Learning
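
The abstract for this record describes features derived from "hidden signatures embedded in peptide primary sequences" but does not spell out how they are built. As a generic illustration only, and assuming such features resemble n-gram and skip-gram counts over residues, the sketch below turns a sequence into count features. It is not the m-NGSG or CSPred implementation, and the example sequence is invented.

```python
from collections import Counter


def ngram_counts(sequence, n):
    """Count every contiguous run of n residues in the sequence."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def skipgram_counts(sequence, gap):
    """Count residue pairs separated by exactly `gap` intervening positions."""
    return Counter(
        (sequence[i], sequence[i + gap + 1])
        for i in range(len(sequence) - gap - 1)
    )


def feature_vector(sequence, vocabulary, n=2):
    """Project n-gram counts onto a fixed vocabulary so sequences become comparable vectors."""
    counts = ngram_counts(sequence, n)
    return [counts.get(token, 0) for token in vocabulary]


# Invented toy sequence; real inputs would be full peptide sequences.
toy_sequence = "CACCAC"
bigrams = ngram_counts(toy_sequence, 2)
skips = skipgram_counts(toy_sequence, 1)
vector = feature_vector(toy_sequence, ["CA", "AC", "CC", "AA"])
```

Vectors like these could be passed to any supervised classifier. Skip-grams are attractive for cystine-stabilized peptides because cysteine spacing can be conserved even when the intervening residues vary.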
16.
BMC Bioinformatics ; 16: 210, 2015 Jul 05.
Article in English | MEDLINE | ID: mdl-26142484

ABSTRACT

BACKGROUND: Numerous organisms have evolved a wide range of toxic peptides for self-defense and predation. Their effective interstitial and macro-environmental use requires energetic and structural stability. One successful group of these peptides includes a tri-disulfide domain arrangement that offers toxicity and high stability. Sequential tri-disulfide connectivity variants create highly compact disulfide folds capable of withstanding a variety of environmental stresses. Their combination of toxicity and stability makes these peptides remarkably valuable for their potential as bio-insecticides, antimicrobial peptides and peptide drug candidates. However, the wide sequence variation, sources and modalities of group members impose serious limitations on our ability to rapidly identify potential members. As a result, there is a need for automated high-throughput member classification approaches that leverage their demonstrated tertiary and functional homology. RESULTS: We developed an SVM-based model to predict sequential tri-disulfide peptide (STP) toxins from peptide sequences. One optimized model, called PredSTP, predicted STPs from the training set with sensitivity, specificity, precision, accuracy and a Matthews correlation coefficient of 94.86%, 94.11%, 84.31%, 94.30% and 0.86, respectively, using 200-fold cross-validation. The same model outperforms existing prediction approaches in three independent out-of-sample test sets derived from PDB. CONCLUSION: PredSTP can accurately identify a wide range of cystine-stabilized peptide toxins directly from sequences in a species-agnostic fashion. The ability to rapidly filter sequences for potential bioactive peptides can greatly compress the time between peptide identification and testing structural and functional properties for possible antimicrobial and insecticidal candidates. A web interface is freely available to predict STP toxins from http://crick.ecs.baylor.edu/.


Subject(s)
Algorithms , Cystine/chemistry , Disulfides/chemistry , Models, Statistical , Peptide Fragments/chemistry , Peptide Fragments/pharmacology , Support Vector Machine , Amino Acid Sequence , Animals , Molecular Sequence Data
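
The five performance measures reported in the abstract for this record (sensitivity, specificity, precision, accuracy, and the Matthews correlation coefficient) all follow from a binary confusion matrix. The sketch below shows the standard formulas. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the PredSTP results.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true positives that are recovered
    specificity = tn / (tn + fp)  # fraction of true negatives that are rejected
    precision = tp / (tp + fp)    # fraction of positive calls that are correct
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 100 positives and 200 negatives.
metrics = classification_metrics(tp=90, fp=20, tn=180, fn=10)
```

With unbalanced classes, precision can sit well below sensitivity and specificity even when both are high, which is the same pattern seen in the reported figures. The Matthews correlation coefficient summarises all four cells of the matrix in a single value between -1 and 1.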