Results 1 - 20 of 23
1.
Mol Cell ; 82(19): 3538-3552.e5, 2022 10 06.
Article in English | MEDLINE | ID: mdl-36075220

ABSTRACT

DNA becomes single stranded (ssDNA) during replication, transcription, and repair. Transiently formed ssDNA segments can adopt alternative conformations, including cruciforms, triplexes, and quadruplexes. To determine whether there are stable regions of ssDNA in the human genome, we utilized S1-END-seq to convert ssDNA regions to DNA double-strand breaks, which were then processed for high-throughput sequencing. This approach revealed two predominant non-B DNA structures: cruciform DNA formed by expanded (TA)n repeats that accumulate in microsatellite unstable human cancer cell lines and DNA triplexes (H-DNA) formed by homopurine/homopyrimidine mirror repeats common across a variety of cell lines. We show that H-DNA is enriched during replication, that its genomic location is highly conserved, and that H-DNA formed by (GAA)n repeats can be disrupted by treatment with a (GAA)n-binding polyamide. Finally, we show that triplex-forming repeats are hotspots for mutagenesis. Our results identify dynamic DNA secondary structures in vivo that contribute to elevated genome instability.


Subjects
Cruciform DNA , Nylons , DNA/metabolism , Double-Stranded DNA Breaks , DNA Replication , Humans , Nucleic Acid Conformation
2.
Mol Cell ; 81(12): 2611-2624.e10, 2021 06 17.
Article in English | MEDLINE | ID: mdl-33857404

ABSTRACT

The Shieldin complex shields double-strand DNA breaks (DSBs) from nucleolytic resection. Curiously, the penultimate Shieldin component, SHLD1, is one of the least abundant mammalian proteins. Here, we report that the transcription factors THAP1, YY1, and HCF1 bind directly to the SHLD1 promoter, where they cooperatively maintain the low basal expression of SHLD1, thereby ensuring a proper balance between end protection and resection during DSB repair. The loss of THAP1-dependent SHLD1 expression confers cross-resistance to poly (ADP-ribose) polymerase (PARP) inhibitor and cisplatin in BRCA1-deficient cells and shorter progression-free survival in ovarian cancer patients. Moreover, the embryonic lethality and PARPi sensitivity of BRCA1-deficient mice is rescued by ablation of SHLD1. Our study uncovers a transcriptional network that directly controls DSB repair choice and suggests a potential link between DNA damage and pathogenic THAP1 mutations, found in patients with the neurodevelopmental movement disorder adult-onset torsion dystonia type 6.


Subjects
Cell Cycle Proteins/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Animals , BRCA1 Protein/genetics , BRCA1 Protein/metabolism , Cell Cycle Proteins/genetics , DNA/metabolism , Double-Stranded DNA Breaks/drug effects , DNA End-Joining Repair/drug effects , DNA Repair/genetics , Dystonia/genetics , Female , Host Cell Factor C1/metabolism , Mad2 Proteins/genetics , Male , Mice , Inbred C57BL Mice , Knockout Mice , Poly(ADP-Ribose) Polymerase-1/metabolism , Poly(ADP-Ribose) Polymerase Inhibitors/pharmacology , Recombinational DNA Repair/drug effects , Telomere-Binding Proteins/metabolism , Tumor Suppressor p53-Binding Protein 1/metabolism , YY1 Transcription Factor/metabolism
3.
Nature ; 612(7941): 758-763, 2022 12.
Article in English | MEDLINE | ID: mdl-36517603

ABSTRACT

Coronavirus disease 2019 (COVID-19) is known to cause multi-organ dysfunction (refs. 1-3) during acute infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), with some patients experiencing prolonged symptoms, termed post-acute sequelae of SARS-CoV-2 (refs. 4,5). However, the burden of infection outside the respiratory tract and time to viral clearance are not well characterized, particularly in the brain (refs. 3,6-14). Here we carried out complete autopsies on 44 patients who died with COVID-19, with extensive sampling of the central nervous system in 11 of these patients, to map and quantify the distribution, replication and cell-type specificity of SARS-CoV-2 across the human body, including the brain, from acute infection to more than seven months following symptom onset. We show that SARS-CoV-2 is widely distributed, predominantly among patients who died with severe COVID-19, and that virus replication is present in multiple respiratory and non-respiratory tissues, including the brain, early in infection. Further, we detected persistent SARS-CoV-2 RNA in multiple anatomic sites, including throughout the brain, as late as 230 days following symptom onset in one case. Despite extensive distribution of SARS-CoV-2 RNA throughout the body, we observed little evidence of inflammation or direct viral cytopathology outside the respiratory tract. Our data indicate that in some patients SARS-CoV-2 can cause systemic infection and persist in the body for months.


Subjects
Autopsy , Brain , COVID-19 , Organ Specificity , SARS-CoV-2 , Humans , Brain/virology , COVID-19/virology , Viral RNA/analysis , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification , SARS-CoV-2/pathogenicity , SARS-CoV-2/physiology , Virus Replication , Time Factors , Respiratory System/pathology , Respiratory System/virology
4.
Nature ; 586(7828): 292-298, 2020 10.
Article in English | MEDLINE | ID: mdl-32999459

ABSTRACT

The RecQ DNA helicase WRN is a synthetic lethal target for cancer cells with microsatellite instability (MSI), a form of genetic hypermutability that arises from impaired mismatch repair (refs. 1-4). Depletion of WRN induces widespread DNA double-strand breaks in MSI cells, leading to cell cycle arrest and/or apoptosis. However, the mechanism by which WRN protects MSI-associated cancers from double-strand breaks remains unclear. Here we show that TA-dinucleotide repeats are highly unstable in MSI cells and undergo large-scale expansions, distinct from previously described insertion or deletion mutations of a few nucleotides (ref. 5). Expanded TA repeats form non-B DNA secondary structures that stall replication forks, activate the ATR checkpoint kinase, and require unwinding by the WRN helicase. In the absence of WRN, the expanded TA-dinucleotide repeats are susceptible to cleavage by the MUS81 nuclease, leading to massive chromosome shattering. These findings identify a distinct biomarker that underlies the synthetic lethal dependence on WRN, and support the development of therapeutic agents that target WRN for MSI-associated cancers.


Subjects
Double-Stranded DNA Breaks , DNA Repeat Expansion/genetics , Dinucleotide Repeats/genetics , Neoplasms/genetics , Werner Syndrome Helicase/metabolism , Ataxia Telangiectasia Mutated Proteins/metabolism , Tumor Cell Line , Human Chromosomes/genetics , Human Chromosomes/metabolism , Chromothripsis , DNA Cleavage , DNA Replication , DNA-Binding Proteins/metabolism , Endodeoxyribonucleases/metabolism , Endonucleases/metabolism , Genomic Instability , Humans , Recombinases/metabolism
5.
Int J Mol Sci ; 22(4)2021 Feb 14.
Article in English | MEDLINE | ID: mdl-33672790

ABSTRACT

Nonsense mutations turn a coding (sense) codon into an in-frame stop codon that is assumed to result in a truncated protein product. Thus, nonsense substitutions are the hallmark of pseudogenes and are used to identify them. Here we show that in-frame stop codons within bacterial protein-coding genes are widespread. Their evolutionary conservation suggests that many of them are not pseudogenes, since they maintain dN/dS values (ratios of substitution rates at non-synonymous and synonymous sites) significantly lower than 1 (this is a signature of purifying selection in protein-coding regions). We also found that double substitutions in codons, where an intermediate step is a nonsense substitution, show a higher rate of evolution compared to null models, indicating that a stop codon was introduced and then changed back to sense via positive selection. This further supports the notion that nonsense substitutions in bacteria are relatively common and do not necessarily cause pseudogenization. In-frame stop codons may be an important mechanism of regulation: such codons are likely to cause a substantial decrease of protein expression levels.


Subjects
Nonsense Codon , Terminator Codon/genetics , Open Reading Frames/genetics , Prokaryotic Cells/metabolism , Bacteria/classification , Bacteria/genetics , Bacterial Proteins/classification , Bacterial Proteins/genetics , Base Sequence , Molecular Evolution , Genetic Models , Phylogeny , Point Mutation , Pseudogenes/genetics , Genetic Selection , Nucleic Acid Sequence Homology
6.
BMC Biol ; 17(1): 105, 2019 12 16.
Article in English | MEDLINE | ID: mdl-31842858

ABSTRACT

BACKGROUND: Single nucleotide substitutions in protein-coding genes can be divided into synonymous (S), with little fitness effect, and non-synonymous (N) ones that alter amino acids and thus generally have a greater effect. Most of the N substitutions are affected by purifying selection that eliminates them from evolving populations. However, additional mutations of nearby bases potentially could alleviate the deleterious effect of single substitutions, making them subject to positive selection. To elucidate the effects of selection on double substitutions in all codons, it is critical to differentiate selection from mutational biases. RESULTS: We addressed the evolutionary regimes of within-codon double substitutions in 37 groups of closely related prokaryotic genomes from diverse phyla by comparing the fractions of double substitutions within codons to those of the equivalent double S substitutions in adjacent codons. Under the assumption that substitutions occur one at a time, all within-codon double substitutions can be represented as "ancestral-intermediate-final" sequences (where "intermediate" refers to the first single substitution and "final" refers to the second substitution) and can be partitioned into four classes: (1) SS, S intermediate-S final; (2) SN, S intermediate-N final; (3) NS, N intermediate-S final; and (4) NN, N intermediate-N final. We found that the selective pressure on the second substitution markedly differs among these classes of double substitutions. Analogous to single S (synonymous) substitutions, SS double substitutions evolve neutrally, whereas analogous to single N (non-synonymous) substitutions, SN double substitutions are subject to purifying selection. In contrast, NS show positive selection on the second step because the original amino acid is recovered. 
The NN double substitutions are heterogeneous and can be subject to either purifying or positive selection, or evolve neutrally, depending on the amino acid similarity between the final or intermediate and the ancestral states. CONCLUSIONS: The results of the present, comprehensive analysis of the evolutionary landscape of within-codon double substitutions reaffirm the largely conservative regime of protein evolution. However, the second step of a double substitution can be subject to positive selection when the first step is deleterious. Such positive selection can result in frequent crossing of valleys on the fitness landscape.


Subjects
Codon/genetics , Molecular Evolution , Mutation , Prokaryotic Cells/physiology , Genetic Selection
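The four-class partition described in the abstract of record 6 can be illustrated with a minimal Python sketch. It assumes the standard genetic code and reads both letters of a class as "synonymous or non-synonymous relative to the ancestral codon", which is an interpretation of the abstract (NS recovers the original amino acid) and not the authors' published pipeline. The function name `classify_double_substitution` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def n_diffs(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def classify_double_substitution(ancestral, intermediate, final):
    """Return 'SS', 'SN', 'NS' or 'NN' for an ancestral-intermediate-final path.

    Assumption of this sketch: each letter compares the encoded amino acid
    with the ancestral one (S = same amino acid, N = different).
    """
    if n_diffs(ancestral, intermediate) != 1 or n_diffs(intermediate, final) != 1:
        raise ValueError("each step must be a single-nucleotide substitution")
    if n_diffs(ancestral, final) != 2:
        raise ValueError("the two substitutions must hit different positions")
    anc_aa = CODON_TABLE[ancestral]
    first = "S" if CODON_TABLE[intermediate] == anc_aa else "N"
    second = "S" if CODON_TABLE[final] == anc_aa else "N"
    return first + second


print(classify_double_substitution("TTA", "CTA", "CTG"))  # Leu -> Leu -> Leu
print(classify_double_substitution("TCT", "ACT", "AGT"))  # Ser -> Thr -> Ser
```

The second call is the serine case in which the second step restores the ancestral amino acid, the situation the abstract associates with positive selection on the second substitution.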
7.
Proc Natl Acad Sci U S A ; 113(46): 13109-13113, 2016 11 15.
Article in English | MEDLINE | ID: mdl-27799560

ABSTRACT

Serine is the only amino acid that is encoded by two disjoint codon sets so that a tandem substitution of two nucleotides is required to switch between the two sets. Previously published evidence suggests that, for the most evolutionarily conserved serines, the codon set switch occurs by simultaneous substitution of two nucleotides. Here we report a genome-wide reconstruction of the evolution of serine codons in triplets of closely related species from diverse prokaryotes and eukaryotes. The results indicate that the great majority of codon set switches proceed by two consecutive nucleotide substitutions, via a threonine or cysteine intermediate, and are driven by selection. These findings imply a strong pressure of purifying selection in protein evolution, which in the case of serine codon set switches occurs via an initial deleterious substitution quickly followed by a second, compensatory substitution. The result is frequent reversal of amino acid replacements and, at short evolutionary distances, pervasive homoplasy.


Subjects
Codon/genetics , Serine/genetics , Animals , Archaea/genetics , Bacteria/genetics , Molecular Evolution , Humans , Mutation , Saccharomyces/genetics , Genetic Selection
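The claim in the abstract of record 7 that switching between the two serine codon sets needs two nucleotide substitutions, passing through threonine or cysteine, can be checked directly against the standard genetic code. The following is a small sketch under that assumption only; the helper names are illustrative and are not taken from the paper.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def hamming(a, b):
    """Number of nucleotide differences between two codons."""
    return sum(x != y for x, y in zip(a, b))


# The two disjoint serine codon sets: TCN (four codons) and AGY (two codons).
serine = [c for c, aa in CODON_TABLE.items() if aa == "S"]
TCN = sorted(c for c in serine if c.startswith("TC"))
AGY = sorted(c for c in serine if c.startswith("AG"))


def intermediates(src, dst):
    """Codons that are one substitution away from both src and dst."""
    return sorted(c for c in CODON_TABLE
                  if hamming(src, c) == 1 and hamming(c, dst) == 1)


# Smallest number of substitutions needed to cross between the sets.
min_dist = min(hamming(a, b) for a in TCN for b in AGY)

# Amino acids encoded by the intermediate codons on the shortest paths.
intermediate_aas = set()
for a in TCN:
    for b in AGY:
        if hamming(a, b) == min_dist:
            intermediate_aas.update(CODON_TABLE[m] for m in intermediates(a, b))

print(min_dist)                   # 2
print(sorted(intermediate_aas))   # ['C', 'T'], i.e. cysteine and threonine
```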
8.
Biomed Eng Online ; 16(Suppl 1): 72, 2017 Aug 18.
Article in English | MEDLINE | ID: mdl-28830434

ABSTRACT

BACKGROUND: A key challenge in the realm of human disease research is next generation sequencing (NGS) interpretation, whereby identified filtered variant-harboring genes are associated with a patient's disease phenotypes. This necessitates bioinformatics tools linked to comprehensive knowledgebases. The GeneCards suite databases, which include GeneCards (human genes), MalaCards (human diseases) and PathCards (human pathways) together with additional tools, are presented with the focus on MalaCards utility for NGS interpretation as well as for large scale bioinformatic analyses. RESULTS: VarElect, our NGS interpretation tool, leverages the broad information in the GeneCards suite databases. MalaCards algorithms unify disease-related terms and annotations from 69 sources. Further, MalaCards defines hierarchical relatedness: aliases, disease families, a related diseases network, categories and ontological classifications. GeneCards and MalaCards delineate and share a multi-tiered, scored gene-disease network, with stringency levels, including the definition of elite status (high-quality gene-disease pairs coming from manually curated, trustworthy sources), that includes 4500 genes for 8000 diseases. This unique resource is key to NGS interpretation by VarElect. VarElect, a comprehensive search tool that helps infer both direct and indirect links between genes and user-supplied disease/phenotype terms, is robustly strengthened by the information found in MalaCards. The indirect mode benefits from GeneCards' diverse gene-to-gene relationships, including SuperPaths (integrated biological pathways from 12 information sources). We are currently adding an important information layer in the form of "disease SuperPaths", generated from the gene-disease matrix by an algorithm similar to that previously employed for biological pathway unification. This allows the discovery of novel gene-disease and disease-disease relationships.
The advent of whole genome sequencing necessitates capacities to go beyond protein coding genes. GeneCards is highly useful in this respect, as it also addresses 101,976 non-protein-coding RNA genes. In a more recent development, we are currently adding an inclusive map of regulatory elements and their inferred target genes, generated by integration from 4 resources. CONCLUSIONS: MalaCards provides a rich big-data scaffold for in silico biomedical discovery within the gene-disease universe. VarElect, which depends significantly on both GeneCards and MalaCards power, is a potent tool for supporting the interpretation of wet-lab experiments, notably NGS analyses of disease. The GeneCards suite has thus transcended its 2-decade role in biomedical research, maturing into a key player in clinical investigation.


Subjects
Computational Biology/methods , Disease/genetics , High-Throughput Nucleotide Sequencing , Genetic Databases , Genomics , Humans , Phenotype
9.
BMC Genomics ; 17 Suppl 2: 444, 2016 06 23.
Article in English | MEDLINE | ID: mdl-27357693

ABSTRACT

BACKGROUND: Next generation sequencing (NGS) provides a key technology for deciphering the genetic underpinnings of human diseases. Typical NGS analyses of a patient depict tens of thousands of non-reference coding variants, but only one or very few are expected to be significant for the relevant disorder. In a filtering stage, one employs family segregation, rarity in the population, predicted protein impact and evolutionary conservation as a means for shortening the variation list. However, narrowing down further towards culprit disease genes usually entails laborious seeking of gene-phenotype relationships, consulting numerous separate databases. Thus, a major challenge is to transition from the few hundred shortlisted genes to the most viable disease-causing candidates. RESULTS: We describe a novel tool, VarElect ( http://ve.genecards.org ), a comprehensive phenotype-dependent variant/gene prioritizer, based on the widely-used GeneCards, which helps rapidly identify causal mutations with extensive evidence. The GeneCards suite offers an effective and speedy alternative, whereby >120 gene-centric automatically-mined data sources are jointly available for the task. VarElect cashes in on this wealth of information, as well as on GeneCards' powerful free-text Boolean search and scoring capabilities, proficiently matching variant-containing genes to submitted disease/symptom keywords. The tool also leverages the rich disease and pathway information of MalaCards, the human disease database, and PathCards, the unified pathway (SuperPaths) database, both within the GeneCards Suite. The VarElect algorithm infers direct as well as indirect links between genes and phenotypes, the latter benefitting from GeneCards' diverse gene-to-gene data links in GenesLikeMe. Finally, our tool offers an extensive gene-phenotype evidence portrayal ("MiniCards") and hyperlinks to the parent databases.
CONCLUSIONS: We demonstrate that VarElect compares favorably with several often-used NGS phenotyping tools, thus providing a robust facility for ranking genes, pointing out their likelihood to be related to a patient's disease. VarElect's capacity to automatically process numerous NGS cases, either in stand-alone format or in VCF-analyzer mode (TGex and VarAnnot), is indispensable for emerging clinical projects that involve thousands of whole exome/genome NGS analyses.


Subjects
Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Algorithms , Data Mining , Genetic Databases , Human Genome , Humans , Phenotype
10.
Bioinformatics ; 29(2): 255-61, 2013 Jan 15.
Article in English | MEDLINE | ID: mdl-23172862

ABSTRACT

MOTIVATION: Non-coding RNA (ncRNA) genes are increasingly acknowledged for their importance in the human genome. However, there is no comprehensive non-redundant database for all such human genes. RESULTS: We leveraged the effective platform of GeneCards, the human gene compendium, together with the power of fRNAdb and additional primary sources, to judiciously unify all ncRNA gene entries obtainable from 15 different primary sources. Overlapping entries were clustered to unified locations based on an algorithm employing genomic coordinates. This allowed GeneCards' gamut of relevant entries to rise ∼5-fold, resulting in ∼80,000 human non-redundant ncRNAs, belonging to 14 classes. Such 'grand unification' within a regularly updated data structure will assist future ncRNA research. AVAILABILITY AND IMPLEMENTATION: All of these non-coding RNAs are included among the ∼122,500 entries in GeneCards V3.09, along with pertinent annotation, automatically mined by its built-in pipeline from 100 data sources. This information is available at www.genecards.org. CONTACT: Frida.Belinky@weizmann.ac.il SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genetic Databases , Untranslated RNA/genetics , Algorithms , Cluster Analysis , Genes , Human Genome , Genomics , Humans , Internet , Molecular Sequence Annotation
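The abstract of record 10 mentions clustering overlapping entries into unified locations by genomic coordinates but does not spell out the algorithm. One common way to do this is a sort-and-sweep interval merge, sketched below. The function name, the record layout, the strand-aware grouping and the 1-based inclusive coordinates are all assumptions of this sketch, and the sample records are invented for illustration.

```python
from collections import defaultdict


def cluster_by_coordinates(entries):
    """Group entries that overlap on the same chromosome and strand.

    entries: iterable of (entry_id, chrom, strand, start, end) with
    1-based inclusive coordinates (an assumption of this sketch).
    Returns a list of (chrom, strand, start, end, [entry_ids]).
    """
    by_locus = defaultdict(list)
    for entry_id, chrom, strand, start, end in entries:
        by_locus[(chrom, strand)].append((start, end, entry_id))

    clusters = []
    for (chrom, strand), items in sorted(by_locus.items()):
        items.sort()  # sweep from the leftmost start coordinate
        cur_start, cur_end, ids = items[0][0], items[0][1], [items[0][2]]
        for start, end, entry_id in items[1:]:
            if start <= cur_end:
                # Overlaps the running cluster: extend it.
                cur_end = max(cur_end, end)
                ids.append(entry_id)
            else:
                # Gap found: close the running cluster and start a new one.
                clusters.append((chrom, strand, cur_start, cur_end, ids))
                cur_start, cur_end, ids = start, end, [entry_id]
        clusters.append((chrom, strand, cur_start, cur_end, ids))
    return clusters


# Hypothetical records standing in for entries from different sources.
records = [
    ("srcA:1", "chr1", "+", 100, 180),
    ("srcB:7", "chr1", "+", 150, 220),
    ("srcC:3", "chr1", "+", 400, 450),
    ("srcA:2", "chr1", "-", 160, 200),
]
for cluster in cluster_by_coordinates(records):
    print(cluster)
```

Sorting first makes the merge a single linear pass per chromosome and strand, so the cost is dominated by the sort.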
11.
bioRxiv ; 2024 Jan 06.
Article in English | MEDLINE | ID: mdl-38313289

ABSTRACT

Previous studies have linked the evolution of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) genetic variants to persistent infections in people with immunocompromising conditions (refs. 1-4), but the evolutionary processes underlying these observations are incompletely understood. Here we used high-throughput, single-genome amplification and sequencing (HT-SGS) to obtain up to ~10^3 SARS-CoV-2 spike gene sequences in each of 184 respiratory samples from 22 people with HIV (PWH) and 25 people without HIV (PWOH). Twelve of 22 PWH had advanced HIV infection, defined by peripheral blood CD4 T cell counts (i.e., CD4 counts) <200 cells/µL. In PWOH and PWH with CD4 counts ≥200 cells/µL, most single-genome spike sequences in each person matched one haplotype that predominated throughout the infection. By contrast, people with advanced HIV showed elevated intra-host spike diversity with a median of 46 haplotypes per person (IQR 14-114). Higher intra-host spike diversity immediately after COVID-19 symptom onset predicted longer SARS-CoV-2 RNA shedding among PWH, and intra-host spike diversity at this timepoint was significantly higher in people with advanced HIV than in PWOH. Composition of spike sequence populations in people with advanced HIV fluctuated rapidly over time, with founder sequences often replaced by groups of new haplotypes. These population-level changes were associated with a high total burden of intra-host mutations and positive selection at functionally important residues. In several cases, delayed emergence of detectable serum binding to spike was associated with positive selection for presumptive antibody-escape mutations. Taken together, our findings show remarkable intra-host genetic diversity of SARS-CoV-2 in advanced HIV infection and suggest that adaptive intra-host SARS-CoV-2 evolution in this setting may contribute to the emergence of new variants of concern (VOCs).

12.
Mol Phylogenet Evol ; 63(3): 702-13, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22387211

ABSTRACT

Phylogenetic relationships within sponge classes are highly debated. The low phylogenetic signal observed with some current molecular data can be attributed to the use of few markers, usually slowly-evolving, such as the nuclear rDNA genes and the mitochondrial COI gene. In this study, we conducted a bioinformatics search for a new molecular marker. We sought a marker that (1) is likely to have no paralogs; (2) evolves under a fast evolutionary rate; (3) is part of a continuous exonic region; and (4) is flanked by conserved regions. Our search suggested the nuclear ALG11 as a potential suitable marker. We next demonstrated that this marker can indeed be used for solving phylogenetic relationships within sponges. Specifically, we successfully amplified the ALG11 gene from DNA samples of representatives from all four sponge classes as well as from several cnidarian classes. We also amplified the 18S rDNA and the COI gene for these species. Finally, we analyzed the phylogenetic performance of ALG11 to solve sponge relationships compared to and in combination with the nuclear 18S rDNA and the COI mtDNA genes. Interestingly, the ALG11 marker seems to be superior to the widely-used COI marker. Our work thus indicates that the ALG11 marker is a relevant marker which can complement and corroborate the phylogenetic inferences observed with nuclear ribosomal genes. This marker is also expected to contribute to resolving evolutionary relationships of other apparently slow-evolving animal phyla, such as cnidarians.


Subjects
Electron Transport Complex IV/genetics , Mannosyltransferases/genetics , Porifera/genetics , 18S Ribosomal RNA/genetics , Animals , Bayes Theorem , Genetic Markers , Likelihood Functions , Molecular Sequence Data , Phylogeny , Sequence Alignment , DNA Sequence Analysis
13.
Hum Genomics ; 5(6): 709-17, 2011 Oct.
Article in English | MEDLINE | ID: mdl-22155609

ABSTRACT

Since 1998, the bioinformatics, systems biology, genomics and medical communities have enjoyed a synergistic relationship with the GeneCards database of human genes (http://www.genecards.org). This human gene compendium was created to help to introduce order into the increasing chaos of information flow. As a consequence of viewing details and deep links related to specific genes, users have often requested enhanced capabilities, such that, over time, GeneCards has blossomed into a suite of tools (including GeneDecks, GeneALaCart, GeneLoc, GeneNote and GeneAnnot) for a variety of analyses of both single human genes and sets thereof. In this paper, we focus on inhouse and external research activities which have been enabled, enhanced, complemented and, in some cases, motivated by GeneCards. In turn, such interactions have often inspired and propelled improvements in GeneCards. We describe here the evolution and architecture of this project, including examples of synergistic applications in diverse areas such as synthetic lethality in cancer, the annotation of genetic variations in disease, omics integration in a systems biology approach to kidney disease, and bioinformatics tools.


Subjects
Genetic Databases , Genes/genetics , Human Genome , Genomics , Computational Biology , Humans
14.
Front Genet ; 13: 991249, 2022.
Article in English | MEDLINE | ID: mdl-36159983

ABSTRACT

Nucleotide substitutions in protein-coding genes can be divided into synonymous (S) and non-synonymous (N) ones that alter amino acids (including nonsense mutations causing stop codons). The S substitutions are expected to have little effect on function. The N substitutions almost always are affected by strong purifying selection that eliminates them from evolving populations. However, additional mutations of nearby bases can modulate the deleterious effect of single N substitutions and, thus, could be subjected to the positive selection. This effect has been demonstrated for mutations in the serine codons, stop codons and double N substitutions in prokaryotes. In all abovementioned cases, a novel technique was applied that allows elucidating the effects of selection on double substitutions considering mutational biases. Here, we applied the same technique to study double N substitutions in eukaryotic lineages of primates and yeast. We identified markedly fewer cases of purifying selection relative to prokaryotes and no evidence of codon double substitutions under positive selection. This is consistent with previous studies of serine codons in primates and yeast. In general, the obtained results strongly suggest that there are major differences between studied pro- and eukaryotes; double substitutions in primates and yeasts largely reflect mutational biases and are not hallmarks of selection. This is especially important in the context of detection of positive selection in codons because it has been suggested that multiple mutations in codons cause false inferences of lineage-specific site positive selection. It is likely that this concern is applicable to previously studied prokaryotes but not to primates and yeasts where markedly fewer double substitutions are affected by positive selection.

15.
Mol Biol Evol ; 27(2): 441-51, 2010 Feb.
Article in English | MEDLINE | ID: mdl-19864469

ABSTRACT

Insertions and deletions (indels) are considered to be rare evolutionary events, the analysis of which may resolve controversial phylogenetic relationships. Indeed, indel characters are often assumed to be less homoplastic than amino acid and nucleotide substitutions and, consequently, more reliable markers for phylogenetic reconstruction. In this study, we analyzed indels from over 1,000 metazoan orthologous genes. We studied the impact of different species sampling, ortholog data sets, lengths of included indels, and indel-coding methods on the resulting metazoan tree. Our results show that, similar to sequence substitutions, indels are homoplastic characters, and their analysis is sensitive to the long-branch attraction artifact. Furthermore, improving the taxon sampling and choosing a closely related outgroup greatly impact the phylogenetic inference. Our indel-based inferences support the Ecdysozoa hypothesis over the Coelomata hypothesis and suggest that sponges are a sister clade to other animals.


Subjects
INDEL Mutation/genetics , Invertebrates/classification , Invertebrates/genetics , Phylogeny , Amino Acid Sequence , Animals , Molecular Evolution , Molecular Sequence Data , Porifera/classification , Porifera/genetics , Proteins/chemistry , Proteins/classification , Proteins/genetics , Amino Acid Sequence Homology
16.
Bioinformatics ; 26(22): 2914-5, 2010 Nov 15.
Article in English | MEDLINE | ID: mdl-20876605

ABSTRACT

The evolutionary analysis of presence and absence profiles (phyletic patterns) is widely used in biology. It is assumed that the observed phyletic pattern is the result of gain and loss dynamics along a phylogenetic tree. Examples of characters that are represented by phyletic patterns include restriction sites, gene families, introns and indels, to name a few. Here, we present a user-friendly web server that accurately infers branch-specific and site-specific gain and loss events. The novel inference methodology is based on a stochastic mapping approach utilizing models that reliably capture the underlying evolutionary processes. A variety of features are available including the ability to analyze the data with various evolutionary models, to infer gain and loss events using either stochastic mapping or maximum parsimony, and to estimate gain and loss rates for each character analyzed. AVAILABILITY: Freely available for use at http://gloome.tau.ac.il/.


Subjects
Computational Biology/methods , Molecular Evolution , Software , Phylogeny
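To make the idea of inferring gain and loss events from a phyletic pattern concrete (record 16), here is a minimal sketch of the classic Fitch parsimony count for one presence/absence character on a rooted binary tree. It is a generic textbook method shown for illustration, not the server's implementation, and it does not cover the stochastic mapping approach described in the abstract. The tree encoding and the function name are assumptions of this sketch.

```python
def fitch_changes(tree, pattern):
    """Minimum number of state changes for one presence/absence character.

    tree: nested 2-tuples with leaf names as strings, e.g. (("A", "B"), "C").
    pattern: dict mapping leaf name -> 0 (absent) or 1 (present).
    Returns (candidate_states_at_this_node, number_of_changes_below_it).
    """
    if isinstance(tree, str):
        return {pattern[tree]}, 0
    left, right = tree
    left_states, left_cost = fitch_changes(left, pattern)
    right_states, right_cost = fitch_changes(right, pattern)
    shared = left_states & right_states
    if shared:
        # Children agree on at least one state: no change needed here.
        return shared, left_cost + right_cost
    # Children disagree: one gain or loss must have happened.
    return left_states | right_states, left_cost + right_cost + 1


# Hypothetical four-taxon tree and two phyletic patterns.
tree = (("A", "B"), ("C", "D"))
clade_specific = {"A": 1, "B": 1, "C": 0, "D": 0}
scattered = {"A": 1, "B": 0, "C": 1, "D": 0}

print(fitch_changes(tree, clade_specific)[1])  # 1 event explains the pattern
print(fitch_changes(tree, scattered)[1])       # 2 events are needed
```

The count treats gains and losses as equally costly and does not say which of the two occurred; separating them requires fixing the root state or using a probabilistic model.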
17.
Cancers (Basel) ; 11(2)2019 Feb 12.
Article in English | MEDLINE | ID: mdl-30759888

ABSTRACT

Cancer genomes accumulate nucleotide sequence variations that number in the tens of thousands per genome. A prominent fraction of these mutations is thought to arise as a consequence of the off-target activity of DNA/RNA editing cytosine deaminases. These enzymes, collectively called activation induced deaminase (AID)/APOBECs, deaminate cytosines located within defined DNA sequence contexts. The resulting changes of the original C:G pair in these contexts (mutational signatures) provide indirect evidence for the participation of specific cytosine deaminases in a given cancer type. The conventional method used for the analysis of mutable motifs is the consensus approach. Here, for the first time, we have adopted the frequently used weight matrix (sequence profile) approach for the analysis of mutagenesis and provide evidence for this method being a more precise descriptor of mutations than the sequence consensus approach. We confirm that while mutational footprints of APOBEC1, APOBEC3A, APOBEC3B, and APOBEC3G are prominent in many cancers, mutable motifs characteristic of the action of the humoral immune response somatic hypermutation enzyme, AID, are the most widespread feature of somatic mutation spectra attributable to deaminases in cancer genomes. Overall, the weight matrix approach reveals that somatic mutations are significantly associated with at least one AID/APOBEC mutable motif in all studied cancers.
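The difference between a consensus and a weight matrix (sequence profile) description of a mutable motif, as discussed in record 17, can be sketched as follows. The aligned sites are invented toy data, and the pseudocount and uniform background are assumptions of this sketch, so it shows the general technique and not the matrices or scoring used in the paper.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        column = [s[pos] for s in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def score(pwm, seq):
    """Sum of per-position log-odds weights; higher means more motif-like."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match the matrix length")
    return sum(col[base] for col, base in zip(pwm, seq))


# Hypothetical aligned mutated sites (toy data, not from the paper).
sites = ["AAC", "AGC", "TAC", "TGC", "AGC", "TAC", "AAC", "GGC"]
pwm = build_pwm(sites)

for candidate in ("AGC", "GGC", "CCT"):
    print(candidate, round(score(pwm, candidate), 3))
```

A consensus pattern would only accept or reject each candidate, whereas the matrix gives a graded score: here `GGC` is rarer than `AGC` among the toy sites and scores lower, yet it is still clearly separated from `CCT`.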

18.
Sci Rep ; 8(1): 9260, 2018 06 18.
Article in English | MEDLINE | ID: mdl-29915293

ABSTRACT

Modes of evolution of stop codons in protein-coding genes, especially the conservation of UAA, have been debated for many years. We reconstructed the evolution of stop codons in 40 groups of closely related prokaryotic and eukaryotic genomes. The results indicate that the UAA codons are maintained by purifying selection in all domains of life. In contrast, positive selection appears to drive switches from UAG to other stop codons in prokaryotes but not in eukaryotes. Changes in stop codons are significantly associated with increased substitution frequency immediately downstream of the stop. These positions are otherwise more strongly conserved in evolution compared to sites farther downstream, suggesting that such substitutions are compensatory. Although GC content has a major impact on stop codon frequencies, its contribution to the decreased frequency of UAA differs between bacteria and archaea, presumably, due to differences in their translation termination mechanisms.


Subjects
Codon, Terminator/genetics , Evolution, Molecular , Selection, Genetic , Base Composition/genetics , Escherichia coli/genetics , Eukaryotic Cells/metabolism , Genome , Phylogeny , Prokaryotic Cells/metabolism
19.
Sci Rep ; 7(1): 12422, 2017 09 29.
Article in English | MEDLINE | ID: mdl-28963504

ABSTRACT

Reconstruction of the evolution of start codons in 36 groups of closely related bacterial and archaeal genomes reveals purifying selection affecting AUG codons. The AUG starts are replaced by GUG and especially UUG significantly less frequently than expected under neutrality, where the neutral expectation is derived from the frequencies of the respective nucleotide triplet substitutions in non-coding regions and in 4-fold degenerate sites. Thus, AUG is the optimal start codon that is actively maintained by purifying selection. However, purifying selection on start codons is significantly weaker than the selection on the same codons in coding sequences, although the switches between the codons result in conservative amino acid substitutions. The only exception is the AUG to UUG switch, which is strongly selected against among start codons. Selection on start codons is most pronounced in evolutionarily conserved, highly expressed genes. Mutation of the start codon to a sub-optimal form (GUG or UUG) tends to be compensated by mutations in the Shine-Dalgarno sequence towards a stronger translation initiation signal. Together, these findings indicate that in prokaryotes, translation start signals are subject to weak but significant selection for maximization of initiation rate and, consequently, protein production.


Subjects
Codon, Initiator/genetics , Genome, Archaeal/genetics , Genome, Bacterial/genetics , RNA, Messenger/genetics , Selection, Genetic/genetics , Escherichia coli/genetics , Mutation
20.
Article in English | MEDLINE | ID: mdl-25725062

ABSTRACT

The study of biological pathways is key to a large number of systems analyses. However, many relevant tools consider a limited number of pathway sources, missing out on many genes and gene-to-gene connections. Simply pooling several pathway sources would result in redundancy and a lack of systematic pathway interrelations. To address this, we applied a combination of hierarchical clustering and nearest neighbor graph representation, with judiciously selected cutoff values, thereby consolidating 3215 human pathways from 12 sources into a set of 1073 SuperPaths. Our unification algorithm finds a balance between reducing redundancy and optimizing the level of pathway-related informativeness for individual genes. We show a substantial enhancement of the SuperPaths' capacity to infer gene-to-gene relationships when compared with individual pathway sources, separately or taken together. Further, we demonstrate that the chosen 12 sources entail nearly exhaustive gene coverage. The computed SuperPaths are presented in a new online database, PathCards, showing each SuperPath, its constituent network of pathways, and its contained genes. This provides researchers with a rich, searchable systems analysis resource. Database URL: http://pathcards.genecards.org/
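The general idea of consolidating redundant pathways from several sources can be sketched as follows. This is a simplified stand-in, not the published unification algorithm: it assumes Jaccard overlap of gene sets as the similarity measure and merges pathways by single-linkage grouping at a fixed cutoff. The pathway names, gene sets, cutoff value, and function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical pathways from hypothetical sources (illustrative only).
PATHWAYS = {
    "source1:glycolysis": {"HK1", "PFKM", "PKM", "GAPDH"},
    "source2:glycolysis_gluconeogenesis": {"HK1", "PFKM", "PKM", "GAPDH", "PCK1"},
    "source1:apoptosis": {"CASP3", "CASP9", "BAX"},
    "source3:apoptosis_signaling": {"CASP3", "CASP9", "BAX", "BCL2"},
    "source2:dna_repair": {"BRCA1", "RAD51"},
}


def jaccard(a, b):
    """Jaccard similarity of two gene sets (assumed similarity measure)."""
    return len(a & b) / len(a | b)


def merge_pathways(pathways, cutoff=0.7):
    """Group pathways whose gene-set overlap reaches the cutoff.

    Uses union-find, which amounts to single-linkage clustering cut at
    the given similarity threshold.
    """
    parent = {name: name for name in pathways}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for a, b in combinations(pathways, 2):
        if jaccard(pathways[a], pathways[b]) >= cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for name in pathways:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    for members in merge_pathways(PATHWAYS):
        print(members)
```

The cutoff controls the trade-off the abstract describes: a lower value merges more aggressively and reduces redundancy, while a higher value keeps pathways separate and preserves more specific gene-to-pathway information.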


Subjects
Biosynthetic Pathways/physiology , Databases, Genetic , Epistasis, Genetic/physiology , Gene Regulatory Networks/physiology , Humans