Results 1 - 20 of 23
1.
Mol Cell ; 82(19): 3538-3552.e5, 2022 10 06.
Article in English | MEDLINE | ID: mdl-36075220

ABSTRACT

DNA becomes single stranded (ssDNA) during replication, transcription, and repair. Transiently formed ssDNA segments can adopt alternative conformations, including cruciforms, triplexes, and quadruplexes. To determine whether there are stable regions of ssDNA in the human genome, we utilized S1-END-seq to convert ssDNA regions to DNA double-strand breaks, which were then processed for high-throughput sequencing. This approach revealed two predominant non-B DNA structures: cruciform DNA formed by expanded (TA)n repeats that accumulate in microsatellite unstable human cancer cell lines and DNA triplexes (H-DNA) formed by homopurine/homopyrimidine mirror repeats common across a variety of cell lines. We show that H-DNA is enriched during replication, that its genomic location is highly conserved, and that H-DNA formed by (GAA)n repeats can be disrupted by treatment with a (GAA)n-binding polyamide. Finally, we show that triplex-forming repeats are hotspots for mutagenesis. Our results identify dynamic DNA secondary structures in vivo that contribute to elevated genome instability.
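
The two repeat classes named in this abstract can be told apart by simple sequence symmetry. The sketch below is illustrative only: the function names are invented here, it is not the S1-END-seq analysis, and it ignores spacers and imperfect repeats that practical detection tools tolerate. An inverted repeat equals its own reverse complement, so it can pair with itself (as a (TA)n cruciform would); a mirror repeat reads the same in both directions on one strand, and the H-DNA class is additionally all-purine or all-pyrimidine on that strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_inverted_repeat(seq: str) -> bool:
    """True if seq equals its own reverse complement, e.g. (TA)n."""
    seq = seq.upper()
    return len(seq) > 0 and seq == seq.translate(COMPLEMENT)[::-1]


def is_homopurine_mirror_repeat(seq: str) -> bool:
    """True if one strand is all purine (A/G) or all pyrimidine (C/T)
    and reads identically left-to-right and right-to-left."""
    seq = seq.upper()
    one_class = set(seq) <= set("AG") or set(seq) <= set("CT")
    return len(seq) > 0 and one_class and seq == seq[::-1]


if __name__ == "__main__":
    print(is_inverted_repeat("TATATATA"))             # (TA)4
    print(is_homopurine_mirror_repeat("GAAGAAGAAG"))  # tract cut from a (GAA)n run
```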


Subject(s)
DNA, Cruciform; Nylons; DNA/metabolism; DNA Breaks, Double-Stranded; DNA Replication; Humans; Nucleic Acid Conformation
2.
Mol Cell ; 81(12): 2611-2624.e10, 2021 06 17.
Article in English | MEDLINE | ID: mdl-33857404

ABSTRACT

The Shieldin complex shields double-strand DNA breaks (DSBs) from nucleolytic resection. Curiously, the penultimate Shieldin component, SHLD1, is one of the least abundant mammalian proteins. Here, we report that the transcription factors THAP1, YY1, and HCF1 bind directly to the SHLD1 promoter, where they cooperatively maintain the low basal expression of SHLD1, thereby ensuring a proper balance between end protection and resection during DSB repair. The loss of THAP1-dependent SHLD1 expression confers cross-resistance to poly(ADP-ribose) polymerase (PARP) inhibitor and cisplatin in BRCA1-deficient cells and shorter progression-free survival in ovarian cancer patients. Moreover, the embryonic lethality and PARPi sensitivity of BRCA1-deficient mice are rescued by ablation of SHLD1. Our study uncovers a transcriptional network that directly controls DSB repair choice and suggests a potential link between DNA damage and pathogenic THAP1 mutations, found in patients with the neurodevelopmental movement disorder adult-onset torsion dystonia type 6.


Subject(s)
Cell Cycle Proteins/metabolism; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Animals; BRCA1 Protein/genetics; BRCA1 Protein/metabolism; Cell Cycle Proteins/genetics; DNA/metabolism; DNA Breaks, Double-Stranded/drug effects; DNA End-Joining Repair/drug effects; DNA Repair/genetics; Dystonia/genetics; Female; Host Cell Factor C1/metabolism; Mad2 Proteins/genetics; Male; Mice; Mice, Inbred C57BL; Mice, Knockout; Poly (ADP-Ribose) Polymerase-1/metabolism; Poly(ADP-ribose) Polymerase Inhibitors/pharmacology; Recombinational DNA Repair/drug effects; Telomere-Binding Proteins/metabolism; Tumor Suppressor p53-Binding Protein 1/metabolism; YY1 Transcription Factor/metabolism
3.
Nature ; 612(7941): 758-763, 2022 12.
Article in English | MEDLINE | ID: mdl-36517603

ABSTRACT

Coronavirus disease 2019 (COVID-19) is known to cause multi-organ dysfunction [1-3] during acute infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), with some patients experiencing prolonged symptoms, termed post-acute sequelae of SARS-CoV-2 [4,5]. However, the burden of infection outside the respiratory tract and time to viral clearance are not well characterized, particularly in the brain [3,6-14]. Here we carried out complete autopsies on 44 patients who died with COVID-19, with extensive sampling of the central nervous system in 11 of these patients, to map and quantify the distribution, replication and cell-type specificity of SARS-CoV-2 across the human body, including the brain, from acute infection to more than seven months following symptom onset. We show that SARS-CoV-2 is widely distributed, predominantly among patients who died with severe COVID-19, and that virus replication is present in multiple respiratory and non-respiratory tissues, including the brain, early in infection. Further, we detected persistent SARS-CoV-2 RNA in multiple anatomic sites, including throughout the brain, as late as 230 days following symptom onset in one case. Despite extensive distribution of SARS-CoV-2 RNA throughout the body, we observed little evidence of inflammation or direct viral cytopathology outside the respiratory tract. Our data indicate that in some patients SARS-CoV-2 can cause systemic infection and persist in the body for months.


Subject(s)
Autopsy; Brain; COVID-19; Organ Specificity; SARS-CoV-2; Humans; Brain/virology; COVID-19/virology; RNA, Viral/analysis; SARS-CoV-2/genetics; SARS-CoV-2/isolation & purification; SARS-CoV-2/pathogenicity; SARS-CoV-2/physiology; Virus Replication; Time Factors; Respiratory System/pathology; Respiratory System/virology
4.
Nature ; 586(7828): 292-298, 2020 10.
Article in English | MEDLINE | ID: mdl-32999459

ABSTRACT

The RecQ DNA helicase WRN is a synthetic lethal target for cancer cells with microsatellite instability (MSI), a form of genetic hypermutability that arises from impaired mismatch repair [1-4]. Depletion of WRN induces widespread DNA double-strand breaks in MSI cells, leading to cell cycle arrest and/or apoptosis. However, the mechanism by which WRN protects MSI-associated cancers from double-strand breaks remains unclear. Here we show that TA-dinucleotide repeats are highly unstable in MSI cells and undergo large-scale expansions, distinct from previously described insertion or deletion mutations of a few nucleotides [5]. Expanded TA repeats form non-B DNA secondary structures that stall replication forks, activate the ATR checkpoint kinase, and require unwinding by the WRN helicase. In the absence of WRN, the expanded TA-dinucleotide repeats are susceptible to cleavage by the MUS81 nuclease, leading to massive chromosome shattering. These findings identify a distinct biomarker that underlies the synthetic lethal dependence on WRN, and support the development of therapeutic agents that target WRN for MSI-associated cancers.
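
As a rough illustration of what measuring a TA-dinucleotide tract involves, the sketch below reports the longest uninterrupted (TA)n run in a sequence. It is a toy under stated assumptions, not the authors' pipeline: the function name is invented, only perfect repeats in the TA phase are counted, and comparing tract lengths between samples to flag an expansion is left to the reader.

```python
import re


def longest_ta_tract(seq: str) -> int:
    """Number of TA units in the longest uninterrupted (TA)n run of seq."""
    runs = re.findall(r"(?:TA)+", seq.upper())
    return max((len(run) // 2 for run in runs), default=0)


if __name__ == "__main__":
    print(longest_ta_tract("GGTATATACCTATAGG"))
```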


Subject(s)
DNA Breaks, Double-Stranded; DNA Repeat Expansion/genetics; Dinucleotide Repeats/genetics; Neoplasms/genetics; Werner Syndrome Helicase/metabolism; Ataxia Telangiectasia Mutated Proteins/metabolism; Cell Line, Tumor; Chromosomes, Human/genetics; Chromosomes, Human/metabolism; Chromothripsis; DNA Cleavage; DNA Replication; DNA-Binding Proteins/metabolism; Endodeoxyribonucleases/metabolism; Endonucleases/metabolism; Genomic Instability; Humans; Recombinases/metabolism
5.
Int J Mol Sci ; 22(4)2021 Feb 14.
Article in English | MEDLINE | ID: mdl-33672790

ABSTRACT

Nonsense mutations turn a coding (sense) codon into an in-frame stop codon that is assumed to result in a truncated protein product. Thus, nonsense substitutions are the hallmark of pseudogenes and are used to identify them. Here we show that in-frame stop codons within bacterial protein-coding genes are widespread. Their evolutionary conservation suggests that many of them are not pseudogenes, since they maintain dN/dS values (ratios of substitution rates at non-synonymous and synonymous sites) significantly lower than 1 (this is a signature of purifying selection in protein-coding regions). We also found that double substitutions in codons (where an intermediate step is a nonsense substitution) show a higher rate of evolution compared to null models, indicating that a stop codon was introduced and then changed back to sense via positive selection. This further supports the notion that nonsense substitutions in bacteria are relatively common and do not necessarily cause pseudogenization. In-frame stop codons may be an important mechanism of regulation: Such codons are likely to cause a substantial decrease of protein expression levels.
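
To make "in-frame stop codon" concrete, here is a minimal sketch that lists stop codons occurring in the reading frame before the final codon of a coding sequence. The function name is invented for illustration, the three stops of the standard genetic code are assumed (genomes using another translation table would need a different set), and this is not the procedure used in the study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def internal_stop_positions(cds: str) -> list[int]:
    """0-based codon indices of in-frame stops located before the last codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


if __name__ == "__main__":
    # ATG AAA TAG GGC TAA: the TAG at codon index 2 is an internal stop.
    print(internal_stop_positions("ATGAAATAGGGCTAA"))
```

Note that the out-of-frame TGA spanning the first two codons of the example is not reported, because only the annotated frame is scanned.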


Subject(s)
Codon, Nonsense; Codon, Terminator/genetics; Open Reading Frames/genetics; Prokaryotic Cells/metabolism; Bacteria/classification; Bacteria/genetics; Bacterial Proteins/classification; Bacterial Proteins/genetics; Base Sequence; Evolution, Molecular; Models, Genetic; Phylogeny; Point Mutation; Pseudogenes/genetics; Selection, Genetic; Sequence Homology, Nucleic Acid
6.
BMC Biol ; 17(1): 105, 2019 12 16.
Article in English | MEDLINE | ID: mdl-31842858

ABSTRACT

BACKGROUND: Single nucleotide substitutions in protein-coding genes can be divided into synonymous (S), with little fitness effect, and non-synonymous (N) ones that alter amino acids and thus generally have a greater effect. Most of the N substitutions are affected by purifying selection that eliminates them from evolving populations. However, additional mutations of nearby bases potentially could alleviate the deleterious effect of single substitutions, making them subject to positive selection. To elucidate the effects of selection on double substitutions in all codons, it is critical to differentiate selection from mutational biases. RESULTS: We addressed the evolutionary regimes of within-codon double substitutions in 37 groups of closely related prokaryotic genomes from diverse phyla by comparing the fractions of double substitutions within codons to those of the equivalent double S substitutions in adjacent codons. Under the assumption that substitutions occur one at a time, all within-codon double substitutions can be represented as "ancestral-intermediate-final" sequences (where "intermediate" refers to the first single substitution and "final" refers to the second substitution) and can be partitioned into four classes: (1) SS, S intermediate-S final; (2) SN, S intermediate-N final; (3) NS, N intermediate-S final; and (4) NN, N intermediate-N final. We found that the selective pressure on the second substitution markedly differs among these classes of double substitutions. Analogous to single S (synonymous) substitutions, SS double substitutions evolve neutrally, whereas analogous to single N (non-synonymous) substitutions, SN double substitutions are subject to purifying selection. In contrast, NS show positive selection on the second step because the original amino acid is recovered. 
The NN double substitutions are heterogeneous and can be subject to either purifying or positive selection, or evolve neutrally, depending on the amino acid similarity between the final or intermediate and the ancestral states. CONCLUSIONS: The results of the present, comprehensive analysis of the evolutionary landscape of within-codon double substitutions reaffirm the largely conservative regime of protein evolution. However, the second step of a double substitution can be subject to positive selection when the first step is deleterious. Such positive selection can result in frequent crossing of valleys on the fitness landscape.
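
The four-class partition described in this abstract can be written down directly. The sketch below is one reading of the abstract, not the authors' code: both the intermediate and the final codon are compared with the ancestral codon (consistent with the statement that in NS the original amino acid is recovered), the standard genetic code is assumed, and the function names are invented.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def relation_to_ancestor(ancestral: str, derived: str) -> str:
    """'S' if derived encodes the same amino acid as ancestral, else 'N'."""
    return "S" if CODON_TABLE[ancestral] == CODON_TABLE[derived] else "N"


def classify_double_substitution(ancestral: str, intermediate: str, final: str) -> str:
    """Return 'SS', 'SN', 'NS' or 'NN' for an ancestral-intermediate-final path."""
    def n_diffs(x, y):
        return sum(p != q for p, q in zip(x, y))

    if n_diffs(ancestral, intermediate) != 1 or n_diffs(intermediate, final) != 1 \
            or n_diffs(ancestral, final) != 2:
        raise ValueError("expected two single-nucleotide steps at different positions")
    return (relation_to_ancestor(ancestral, intermediate)
            + relation_to_ancestor(ancestral, final))


if __name__ == "__main__":
    # Ser (TCT) -> Thr (ACT) -> Ser (AGT): the original amino acid is recovered.
    print(classify_double_substitution("TCT", "ACT", "AGT"))
```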


Subject(s)
Codon/genetics; Evolution, Molecular; Mutation; Prokaryotic Cells/physiology; Selection, Genetic
7.
Proc Natl Acad Sci U S A ; 113(46): 13109-13113, 2016 11 15.
Article in English | MEDLINE | ID: mdl-27799560

ABSTRACT

Serine is the only amino acid that is encoded by two disjoint codon sets so that a tandem substitution of two nucleotides is required to switch between the two sets. Previously published evidence suggests that, for the most evolutionarily conserved serines, the codon set switch occurs by simultaneous substitution of two nucleotides. Here we report a genome-wide reconstruction of the evolution of serine codons in triplets of closely related species from diverse prokaryotes and eukaryotes. The results indicate that the great majority of codon set switches proceed by two consecutive nucleotide substitutions, via a threonine or cysteine intermediate, and are driven by selection. These findings imply a strong pressure of purifying selection in protein evolution, which in the case of serine codon set switches occurs via an initial deleterious substitution quickly followed by a second, compensatory substitution. The result is frequent reversal of amino acid replacements and, at short evolutionary distances, pervasive homoplasy.
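
The claim that switches between the two serine codon sets pass through threonine or cysteine follows from the standard genetic code and can be checked by enumeration. The sketch below is an illustration with an invented function name, not the reconstruction method of the paper: it takes every TCN/AGY serine pair that differs at exactly two positions and records the amino acid encoded by each possible single-substitution intermediate.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# product() varies the last base fastest, matching the table order above.
CODON_TABLE = {
    "".join(codon): AMINO_ACIDS[i]
    for i, codon in enumerate(product(BASES, repeat=3))
}


def serine_switch_intermediates() -> set[str]:
    """Amino acids encoded by one-step intermediates between TCN and AGY serines."""
    tcn = [c for c, aa in CODON_TABLE.items() if aa == "S" and c.startswith("TC")]
    agy = [c for c, aa in CODON_TABLE.items() if aa == "S" and c.startswith("AG")]
    intermediates = set()
    for start, end in product(tcn, agy):
        diffs = [i for i in range(3) if start[i] != end[i]]
        if len(diffs) != 2:
            continue  # these pairs would need a third substitution
        for pos in diffs:
            mid = start[:pos] + end[pos] + start[pos + 1:]
            intermediates.add(CODON_TABLE[mid])
    return intermediates


if __name__ == "__main__":
    print(sorted(serine_switch_intermediates()))  # one-letter codes: C = Cys, T = Thr
```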


Subject(s)
Codon/genetics; Serine/genetics; Animals; Archaea/genetics; Bacteria/genetics; Evolution, Molecular; Humans; Mutation; Saccharomyces/genetics; Selection, Genetic
8.
Biomed Eng Online ; 16(Suppl 1): 72, 2017 Aug 18.
Article in English | MEDLINE | ID: mdl-28830434

ABSTRACT

BACKGROUND: A key challenge in the realm of human disease research is next generation sequencing (NGS) interpretation, whereby identified filtered variant-harboring genes are associated with a patient's disease phenotypes. This necessitates bioinformatics tools linked to comprehensive knowledgebases. The GeneCards suite databases, which include GeneCards (human genes), MalaCards (human diseases) and PathCards (human pathways) together with additional tools, are presented with the focus on MalaCards utility for NGS interpretation as well as for large scale bioinformatic analyses. RESULTS: VarElect, our NGS interpretation tool, leverages the broad information in the GeneCards suite databases. MalaCards algorithms unify disease-related terms and annotations from 69 sources. Further, MalaCards defines hierarchical relatedness: aliases, disease families, a related diseases network, categories and ontological classifications. GeneCards and MalaCards delineate and share a multi-tiered, scored gene-disease network, with stringency levels, including the definition of elite status (high-quality gene-disease pairs coming from manually curated, trustworthy sources), which includes 4500 genes for 8000 diseases. This unique resource is key to NGS interpretation by VarElect. VarElect, a comprehensive search tool that helps infer both direct and indirect links between genes and user-supplied disease/phenotype terms, is robustly strengthened by the information found in MalaCards. The indirect mode benefits from GeneCards' diverse gene-to-gene relationships, including SuperPaths, integrated biological pathways from 12 information sources. We are currently adding an important information layer in the form of "disease SuperPaths", generated from the gene-disease matrix by an algorithm similar to that previously employed for biological pathway unification. This allows the discovery of novel gene-disease and disease-disease relationships.
The advent of whole genome sequencing necessitates capacities to go beyond protein coding genes. GeneCards is highly useful in this respect, as it also addresses 101,976 non-protein-coding RNA genes. In a more recent development, we are currently adding an inclusive map of regulatory elements and their inferred target genes, generated by integration from 4 resources. CONCLUSIONS: MalaCards provides a rich big-data scaffold for in silico biomedical discovery within the gene-disease universe. VarElect, which depends significantly on both GeneCards and MalaCards power, is a potent tool for supporting the interpretation of wet-lab experiments, notably NGS analyses of disease. The GeneCards suite has thus transcended its 2-decade role in biomedical research, maturing into a key player in clinical investigation.


Subject(s)
Computational Biology/methods; Disease/genetics; High-Throughput Nucleotide Sequencing; Databases, Genetic; Genomics; Humans; Phenotype
9.
BMC Genomics ; 17 Suppl 2: 444, 2016 06 23.
Article in English | MEDLINE | ID: mdl-27357693

ABSTRACT

BACKGROUND: Next generation sequencing (NGS) provides a key technology for deciphering the genetic underpinnings of human diseases. Typical NGS analyses of a patient depict tens of thousands of non-reference coding variants, but only one or very few are expected to be significant for the relevant disorder. In a filtering stage, one employs family segregation, rarity in the population, predicted protein impact and evolutionary conservation as a means for shortening the variation list. However, narrowing down further towards culprit disease genes usually entails laborious seeking of gene-phenotype relationships, consulting numerous separate databases. Thus, a major challenge is to transition from the few hundred shortlisted genes to the most viable disease-causing candidates. RESULTS: We describe a novel tool, VarElect ( http://ve.genecards.org ), a comprehensive phenotype-dependent variant/gene prioritizer, based on the widely-used GeneCards, which helps rapidly identify causal mutations with extensive evidence. The GeneCards suite offers an effective and speedy alternative, whereby >120 gene-centric automatically-mined data sources are jointly available for the task. VarElect capitalizes on this wealth of information, as well as on GeneCards' powerful free-text Boolean search and scoring capabilities, proficiently matching variant-containing genes to submitted disease/symptom keywords. The tool also leverages the rich disease and pathway information of MalaCards, the human disease database, and PathCards, the unified pathway (SuperPaths) database, both within the GeneCards Suite. The VarElect algorithm infers direct as well as indirect links between genes and phenotypes, the latter benefitting from GeneCards' diverse gene-to-gene data links in GenesLikeMe. Finally, our tool offers an extensive gene-phenotype evidence portrayal ("MiniCards") and hyperlinks to the parent databases.
CONCLUSIONS: We demonstrate that VarElect compares favorably with several often-used NGS phenotyping tools, thus providing a robust facility for ranking genes, pointing out their likelihood to be related to a patient's disease. VarElect's capacity to automatically process numerous NGS cases, either in stand-alone format or in VCF-analyzer mode (TGex and VarAnnot), is indispensable for emerging clinical projects that involve thousands of whole exome/genome NGS analyses.


Subject(s)
Computational Biology/methods; High-Throughput Nucleotide Sequencing/methods; Algorithms; Data Mining; Databases, Genetic; Genome, Human; Humans; Phenotype
10.
Bioinformatics ; 29(2): 255-61, 2013 Jan 15.
Article in English | MEDLINE | ID: mdl-23172862

ABSTRACT

MOTIVATION: Non-coding RNA (ncRNA) genes are increasingly acknowledged for their importance in the human genome. However, there is no comprehensive non-redundant database for all such human genes. RESULTS: We leveraged the effective platform of GeneCards, the human gene compendium, together with the power of fRNAdb and additional primary sources, to judiciously unify all ncRNA gene entries obtainable from 15 different primary sources. Overlapping entries were clustered to unified locations based on an algorithm employing genomic coordinates. This allowed GeneCards' gamut of relevant entries to rise ∼5-fold, resulting in ∼80,000 human non-redundant ncRNAs, belonging to 14 classes. Such 'grand unification' within a regularly updated data structure will assist future ncRNA research. AVAILABILITY AND IMPLEMENTATION: All of these non-coding RNAs are included among the ∼122,500 entries in GeneCards V3.09, along with pertinent annotation, automatically mined by its built-in pipeline from 100 data sources. This information is available at www.genecards.org. CONTACT: Frida.Belinky@weizmann.ac.il SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
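
Clustering overlapping entries by genomic coordinates can be pictured as interval merging. The following is a minimal sketch under stated assumptions, not the GeneCards algorithm: the record layout (chromosome, start, end, source) and the function name are hypothetical, any overlap of at least one base merges two entries, and strand and RNA class are ignored.

```python
def cluster_overlapping(entries):
    """Merge (chrom, start, end, source) records that overlap on one chromosome.

    Returns a list of (chrom, start, end, [sources]) clusters.
    """
    clusters = []
    for chrom, start, end, source in sorted(entries):
        last = clusters[-1] if clusters else None
        if last and last[0] == chrom and start <= last[2]:
            last[2] = max(last[2], end)  # extend the current cluster
            last[3].append(source)
        else:
            clusters.append([chrom, start, end, [source]])
    return [tuple(c) for c in clusters]


if __name__ == "__main__":
    records = [
        ("chr1", 100, 200, "sourceA"),
        ("chr1", 150, 250, "sourceB"),
        ("chr1", 300, 400, "sourceC"),
        ("chr2", 120, 180, "sourceD"),
    ]
    for cluster in cluster_overlapping(records):
        print(cluster)
```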


Subject(s)
Databases, Genetic; RNA, Untranslated/genetics; Algorithms; Cluster Analysis; Genes; Genome, Human; Genomics; Humans; Internet; Molecular Sequence Annotation
11.
bioRxiv ; 2024 Jan 06.
Article in English | MEDLINE | ID: mdl-38313289

ABSTRACT

Previous studies have linked the evolution of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) genetic variants to persistent infections in people with immunocompromising conditions [1-4], but the evolutionary processes underlying these observations are incompletely understood. Here we used high-throughput, single-genome amplification and sequencing (HT-SGS) to obtain up to ~10^3 SARS-CoV-2 spike gene sequences in each of 184 respiratory samples from 22 people with HIV (PWH) and 25 people without HIV (PWOH). Twelve of 22 PWH had advanced HIV infection, defined by peripheral blood CD4 T cell counts (i.e., CD4 counts) <200 cells/µL. In PWOH and PWH with CD4 counts ≥200 cells/µL, most single-genome spike sequences in each person matched one haplotype that predominated throughout the infection. By contrast, people with advanced HIV showed elevated intra-host spike diversity with a median of 46 haplotypes per person (IQR 14-114). Higher intra-host spike diversity immediately after COVID-19 symptom onset predicted longer SARS-CoV-2 RNA shedding among PWH, and intra-host spike diversity at this timepoint was significantly higher in people with advanced HIV than in PWOH. Composition of spike sequence populations in people with advanced HIV fluctuated rapidly over time, with founder sequences often replaced by groups of new haplotypes. These population-level changes were associated with a high total burden of intra-host mutations and positive selection at functionally important residues. In several cases, delayed emergence of detectable serum binding to spike was associated with positive selection for presumptive antibody-escape mutations. Taken together, our findings show remarkable intra-host genetic diversity of SARS-CoV-2 in advanced HIV infection and suggest that adaptive intra-host SARS-CoV-2 evolution in this setting may contribute to the emergence of new variants of concern (VOCs).

12.
Mol Phylogenet Evol ; 63(3): 702-13, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22387211

ABSTRACT

Phylogenetic relationships within sponge classes are highly debated. The low phylogenetic signal observed with some current molecular data can be attributed to the use of few markers, usually slowly-evolving, such as the nuclear rDNA genes and the mitochondrial COI gene. In this study, we conducted a bioinformatics search for a new molecular marker. We sought a marker that (1) is likely to have no paralogs; (2) evolves under a fast evolutionary rate; (3) is part of a continuous exonic region; and (4) is flanked by conserved regions. Our search suggested the nuclear ALG11 as a potentially suitable marker. We next demonstrated that this marker can indeed be used for solving phylogenetic relationships within sponges. Specifically, we successfully amplified the ALG11 gene from DNA samples of representatives from all four sponge classes as well as from several cnidarian classes. We also amplified the 18S rDNA and the COI gene for these species. Finally, we analyzed the phylogenetic performance of ALG11 to solve sponge relationships compared to and in combination with the nuclear 18S rDNA and the COI mtDNA genes. Interestingly, the ALG11 marker seems to be superior to the widely-used COI marker. Our work thus indicates that the ALG11 marker is a relevant marker which can complement and corroborate the phylogenetic inferences observed with nuclear ribosomal genes. This marker is also expected to contribute to resolving evolutionary relationships of other apparently slow-evolving animal phyla, such as cnidarians.


Subject(s)
Electron Transport Complex IV/genetics; Mannosyltransferases/genetics; Porifera/genetics; RNA, Ribosomal, 18S/genetics; Animals; Bayes Theorem; Genetic Markers; Likelihood Functions; Molecular Sequence Data; Phylogeny; Sequence Alignment; Sequence Analysis, DNA
13.
Hum Genomics ; 5(6): 709-17, 2011 Oct.
Article in English | MEDLINE | ID: mdl-22155609

ABSTRACT

Since 1998, the bioinformatics, systems biology, genomics and medical communities have enjoyed a synergistic relationship with the GeneCards database of human genes (http://www.genecards.org). This human gene compendium was created to help to introduce order into the increasing chaos of information flow. As a consequence of viewing details and deep links related to specific genes, users have often requested enhanced capabilities, such that, over time, GeneCards has blossomed into a suite of tools (including GeneDecks, GeneALaCart, GeneLoc, GeneNote and GeneAnnot) for a variety of analyses of both single human genes and sets thereof. In this paper, we focus on in-house and external research activities which have been enabled, enhanced, complemented and, in some cases, motivated by GeneCards. In turn, such interactions have often inspired and propelled improvements in GeneCards. We describe here the evolution and architecture of this project, including examples of synergistic applications in diverse areas such as synthetic lethality in cancer, the annotation of genetic variations in disease, omics integration in a systems biology approach to kidney disease, and bioinformatics tools.


Subject(s)
Databases, Genetic; Genes/genetics; Genome, Human; Genomics; Computational Biology; Humans
14.
Front Genet ; 13: 991249, 2022.
Article in English | MEDLINE | ID: mdl-36159983

ABSTRACT

Nucleotide substitutions in protein-coding genes can be divided into synonymous (S) and non-synonymous (N) ones that alter amino acids (including nonsense mutations causing stop codons). The S substitutions are expected to have little effect on function. The N substitutions are almost always affected by strong purifying selection that eliminates them from evolving populations. However, additional mutations of nearby bases can modulate the deleterious effect of single N substitutions and, thus, could be subject to positive selection. This effect has been demonstrated for mutations in the serine codons, stop codons and double N substitutions in prokaryotes. In all of the above-mentioned cases, a novel technique was applied that allows elucidating the effects of selection on double substitutions while accounting for mutational biases. Here, we applied the same technique to study double N substitutions in eukaryotic lineages of primates and yeast. We identified markedly fewer cases of purifying selection relative to prokaryotes and no evidence of codon double substitutions under positive selection. This is consistent with previous studies of serine codons in primates and yeast. In general, the obtained results strongly suggest that there are major differences between the studied pro- and eukaryotes; double substitutions in primates and yeasts largely reflect mutational biases and are not hallmarks of selection. This is especially important in the context of detection of positive selection in codons because it has been suggested that multiple mutations in codons cause false inferences of lineage-specific site positive selection. It is likely that this concern is applicable to previously studied prokaryotes but not to primates and yeasts where markedly fewer double substitutions are affected by positive selection.

15.
Mol Biol Evol ; 27(2): 441-51, 2010 Feb.
Article in English | MEDLINE | ID: mdl-19864469

ABSTRACT

Insertions and deletions (indels) are considered to be rare evolutionary events, the analysis of which may resolve controversial phylogenetic relationships. Indeed, indel characters are often assumed to be less homoplastic than amino acid and nucleotide substitutions and, consequently, more reliable markers for phylogenetic reconstruction. In this study, we analyzed indels from over 1,000 metazoan orthologous genes. We studied the impact of different species sampling, ortholog data sets, lengths of included indels, and indel-coding methods on the resulting metazoan tree. Our results show that, similar to sequence substitutions, indels are homoplastic characters, and their analysis is sensitive to the long-branch attraction artifact. Furthermore, improving the taxon sampling and choosing a closely related outgroup greatly impact the phylogenetic inference. Our indel-based inferences support the Ecdysozoa hypothesis over the Coelomata hypothesis and suggest that sponges are a sister clade to other animals.


Subject(s)
INDEL Mutation/genetics; Invertebrates/classification; Invertebrates/genetics; Phylogeny; Amino Acid Sequence; Animals; Evolution, Molecular; Molecular Sequence Data; Porifera/classification; Porifera/genetics; Proteins/chemistry; Proteins/classification; Proteins/genetics; Sequence Homology, Amino Acid
16.
Bioinformatics ; 26(22): 2914-5, 2010 Nov 15.
Article in English | MEDLINE | ID: mdl-20876605

ABSTRACT

The evolutionary analysis of presence and absence profiles (phyletic patterns) is widely used in biology. It is assumed that the observed phyletic pattern is the result of gain and loss dynamics along a phylogenetic tree. Examples of characters that are represented by phyletic patterns include restriction sites, gene families, introns and indels, to name a few. Here, we present a user-friendly web server that accurately infers branch-specific and site-specific gain and loss events. The novel inference methodology is based on a stochastic mapping approach utilizing models that reliably capture the underlying evolutionary processes. A variety of features are available including the ability to analyze the data with various evolutionary models, to infer gain and loss events using either stochastic mapping or maximum parsimony, and to estimate gain and loss rates for each character analyzed. AVAILABILITY: Freely available for use at http://gloome.tau.ac.il/.
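
Of the two inference options mentioned, maximum parsimony is the easier one to sketch. The code below applies Fitch's small-parsimony algorithm to one presence/absence character on a rooted binary tree and returns the minimum number of state changes. It is a generic textbook illustration, not the server's implementation: the tree encoding (nested 2-tuples of leaf names) and function name are assumptions, gains and losses are weighted equally, and the stochastic mapping approach is not shown.

```python
def fitch(tree, pattern):
    """Fitch parsimony for one character.

    tree: a leaf name (str) or a 2-tuple of subtrees.
    pattern: dict mapping leaf name -> 0 (absent) or 1 (present).
    Returns (possible_root_states, minimum_number_of_changes).
    """
    if isinstance(tree, str):
        return {pattern[tree]}, 0
    left_states, left_cost = fitch(tree[0], pattern)
    right_states, right_cost = fitch(tree[1], pattern)
    shared = left_states & right_states
    if shared:
        return shared, left_cost + right_cost
    # No shared state: one change is needed on a branch below this node.
    return left_states | right_states, left_cost + right_cost + 1


if __name__ == "__main__":
    tree = ((("A", "B"), "C"), ("D", "E"))
    pattern = {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0}
    print(fitch(tree, pattern))
```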


Subject(s)
Computational Biology/methods; Evolution, Molecular; Software; Phylogeny
17.
Cancers (Basel) ; 11(2)2019 Feb 12.
Article in English | MEDLINE | ID: mdl-30759888

ABSTRACT

Cancer genomes accumulate nucleotide sequence variations that number in the tens of thousands per genome. A prominent fraction of these mutations is thought to arise as a consequence of the off-target activity of DNA/RNA editing cytosine deaminases. These enzymes, collectively called activation induced deaminase (AID)/APOBECs, deaminate cytosines located within defined DNA sequence contexts. The resulting changes of the original C:G pair in these contexts (mutational signatures) provide indirect evidence for the participation of specific cytosine deaminases in a given cancer type. The conventional method used for the analysis of mutable motifs is the consensus approach. Here, for the first time, we have adopted the frequently used weight matrix (sequence profile) approach for the analysis of mutagenesis and provide evidence for this method being a more precise descriptor of mutations than the sequence consensus approach. We confirm that while mutational footprints of APOBEC1, APOBEC3A, APOBEC3B, and APOBEC3G are prominent in many cancers, mutable motifs characteristic of the action of the humoral immune response somatic hypermutation enzyme, AID, are the most widespread feature of somatic mutation spectra attributable to deaminases in cancer genomes. Overall, the weight matrix approach reveals that somatic mutations are significantly associated with at least one AID/APOBEC mutable motif in all studied cancers.
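
The difference between a consensus and a weight matrix is that the matrix scores every base at every position instead of returning match or no match. The sketch below builds a log-odds position weight matrix from aligned sites and scores candidate sequences. It is a generic illustration, not the authors' procedure: the training sites are invented, and the uniform background of 0.25 and the pseudocount of 1 are assumptions.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds weight matrix from equal-length aligned DNA sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2((column.count(base) + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm


def score(pwm, seq):
    """Sum of per-position log-odds for a sequence as long as the matrix."""
    return sum(column[base] for column, base in zip(pwm, seq))


if __name__ == "__main__":
    toy_sites = ["AGCT", "AGCA", "TGCT", "AGCC"]  # invented example sites
    pwm = build_pwm(toy_sites)
    for candidate in ("AGCT", "TGCA", "TTTT"):
        print(candidate, round(score(pwm, candidate), 3))
```

A consensus such as AGCT would accept only the first candidate, whereas the matrix ranks TGCA well above TTTT, which is the graded behaviour the abstract argues for.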

18.
Sci Rep ; 8(1): 9260, 2018 06 18.
Article in English | MEDLINE | ID: mdl-29915293

ABSTRACT

Modes of evolution of stop codons in protein-coding genes, especially the conservation of UAA, have been debated for many years. We reconstructed the evolution of stop codons in 40 groups of closely related prokaryotic and eukaryotic genomes. The results indicate that the UAA codons are maintained by purifying selection in all domains of life. In contrast, positive selection appears to drive switches from UAG to other stop codons in prokaryotes but not in eukaryotes. Changes in stop codons are significantly associated with increased substitution frequency immediately downstream of the stop. These positions are otherwise more strongly conserved in evolution compared to sites farther downstream, suggesting that such substitutions are compensatory. Although GC content has a major impact on stop codon frequencies, its contribution to the decreased frequency of UAA differs between bacteria and archaea, presumably due to differences in their translation termination mechanisms.


Subject(s)
Codon, Terminator/genetics , Evolution, Molecular , Selection, Genetic , Base Composition/genetics , Escherichia coli/genetics , Eukaryotic Cells/metabolism , Genome , Phylogeny , Prokaryotic Cells/metabolism
19.
Sci Rep ; 7(1): 12422, 2017 09 29.
Article in English | MEDLINE | ID: mdl-28963504

ABSTRACT

Reconstruction of the evolution of start codons in 36 groups of closely related bacterial and archaeal genomes reveals purifying selection affecting AUG codons. The AUG starts are replaced by GUG and especially UUG significantly less frequently than expected under neutrality, where the neutral expectation is derived from the frequencies of the respective nucleotide triplet substitutions in non-coding regions and in 4-fold degenerate sites. Thus, AUG is the optimal start codon that is actively maintained by purifying selection. However, purifying selection on start codons is significantly weaker than the selection on the same codons in coding sequences, although the switches between the codons result in conservative amino acid substitutions. The only exception is the AUG to UUG switch, which is strongly selected against among start codons. Selection on start codons is most pronounced in evolutionarily conserved, highly expressed genes. Mutation of the start codon to a sub-optimal form (GUG or UUG) tends to be compensated by mutations in the Shine-Dalgarno sequence towards a stronger translation initiation signal. Together, these findings indicate that in prokaryotes, translation start signals are subject to weak but significant selection for maximization of initiation rate and, consequently, protein production.


Subject(s)
Codon, Initiator/genetics , Genome, Archaeal/genetics , Genome, Bacterial/genetics , RNA, Messenger/genetics , Selection, Genetic/genetics , Escherichia coli/genetics , Mutation
20.
Article in English | MEDLINE | ID: mdl-25725062

ABSTRACT

The study of biological pathways is key to a large number of systems analyses. However, many relevant tools consider a limited number of pathway sources, missing out on many genes and gene-to-gene connections. Simply pooling several pathway sources would result in redundancy and a lack of systematic pathway interrelations. To address this, we applied a combination of hierarchical clustering and nearest neighbor graph representation, with judiciously selected cutoff values, thereby consolidating 3215 human pathways from 12 sources into a set of 1073 SuperPaths. Our unification algorithm finds a balance between reducing redundancy and optimizing the level of pathway-related informativeness for individual genes. We show a substantial enhancement of the SuperPaths' capacity to infer gene-to-gene relationships when compared with individual pathway sources, separately or taken together. Further, we demonstrate that the chosen 12 sources entail nearly exhaustive gene coverage. The computed SuperPaths are presented in a new online database, PathCards, showing each SuperPath, its constituent network of pathways, and its contained genes. This provides researchers with a rich, searchable systems analysis resource. Database URL: http://pathcards.genecards.org/
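
The general idea of consolidating redundant pathways by gene-content similarity can be sketched as follows. This is not the authors' unification algorithm: it swaps in a simpler stand-in, grouping pathways whose Jaccard similarity reaches a cutoff (equivalent to cutting a single-linkage clustering at that threshold). The pathway names, gene sets, cutoff value, and function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical pathways from two made-up sources (illustration only).
EXAMPLE_PATHWAYS = {
    "srcA:apoptosis": {"TP53", "BAX", "CASP3", "CASP9"},
    "srcB:apoptosis": {"TP53", "BAX", "CASP3", "BCL2"},
    "srcA:glycolysis": {"HK1", "PFKM", "PKM"},
}


def jaccard(a, b):
    """Jaccard similarity of two gene sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_pathways(pathways, cutoff=0.5):
    """Group pathway names whose gene sets are linked by similarity >= cutoff."""
    names = list(pathways)
    parent = {n: n for n in names}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if jaccard(pathways[a], pathways[b]) >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    for group in merge_pathways(EXAMPLE_PATHWAYS):
        print(group)
```

With the default cutoff, the two apoptosis entries share three of five distinct genes (similarity 0.6) and are grouped, while the glycolysis entry stays on its own. Raising the cutoff keeps more pathways separate, which is the redundancy-versus-informativeness trade-off the abstract describes.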


Subject(s)
Biosynthetic Pathways/physiology , Databases, Genetic , Epistasis, Genetic/physiology , Gene Regulatory Networks/physiology , Humans