Results 1 - 20 of 139
1.
Genome Biol Evol ; 16(4)2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38597156

ABSTRACT

De novo genes emerge from previously noncoding stretches of the genome. Their encoded de novo proteins are generally expected to be similar to random sequences and, accordingly, with no stable tertiary fold and high predicted disorder. However, structural properties of de novo proteins and whether they differ during the stages of emergence and fixation have not been studied in depth and rely heavily on predictions. Here we generated a library of short human putative de novo proteins of varying lengths and ages and sorted the candidates according to their structural compactness and disorder propensity. Using Förster resonance energy transfer combined with Fluorescence-activated cell sorting, we were able to screen the library for most compact protein structures, as well as most elongated and flexible structures. We find that compact de novo proteins are on average slightly shorter and contain lower predicted disorder than less compact ones. The predicted structures for most and least compact de novo proteins correspond to expectations in that they contain more secondary structure content or higher disorder content, respectively. Our experiments indicate that older de novo proteins have higher compactness and structural propensity compared with young ones. We discuss possible evolutionary scenarios and their implications underlying the age-dependencies of compactness and structural content of putative de novo proteins.


Subjects
Protein Folding , Proteins , Humans , Proteins/genetics , Protein Secondary Structure , Gene Library
2.
Nucleic Acids Res ; 52(1): 274-287, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38000384

ABSTRACT

Most of the transcribed eukaryotic genomes are composed of non-coding transcripts. Among these transcripts, some are newly transcribed when compared to outgroups and are referred to as de novo transcripts. De novo transcripts have been shown to play a major role in genomic innovations. However, little is known about the rates at which de novo transcripts are gained and lost in individuals of the same species. Here, we address this gap and estimate the de novo transcript turnover rate with an evolutionary model. We use DNA long reads and RNA short reads from seven geographically remote samples of inbred individuals of Drosophila melanogaster to detect de novo transcripts that are gained on a short evolutionary time scale. Overall, each sampled individual contains around 2500 unspliced de novo transcripts, with most of them being sample specific. We estimate that around 0.15 transcripts are gained per year, and that each gained transcript is lost at a rate around 5 × 10^-5 per year. This high turnover of transcripts suggests frequent exploration of new genomic sequences within species. These rate estimates are essential to comprehend the process and timescale of de novo gene birth.


Subjects
Drosophila melanogaster , Molecular Evolution , Untranslated RNA , Genetic Transcription , Animals , Humans , Biological Evolution , Drosophila melanogaster/genetics , Genome , Genomics , RNA , Untranslated RNA/chemistry , Untranslated RNA/genetics , Untranslated RNA/metabolism , Geography
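
The gain and loss rates quoted in the abstract of record 2 can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes a simple linear birth-death model (a constant gain rate and a constant per-transcript loss rate). This is an illustrative simplification and not necessarily the evolutionary model the authors used, and all function names are hypothetical. Under that assumption, the expected number of de novo transcripts relaxes towards gain / loss.

```python
import math


def equilibrium_count(gain_per_year, loss_per_transcript_per_year):
    """Steady state of dN/dt = gain - loss * N (assumed model, not the paper's)."""
    return gain_per_year / loss_per_transcript_per_year


def expected_count(t_years, gain_per_year, loss_per_transcript_per_year, n0=0.0):
    """Expected transcript count after t_years, starting from n0 transcripts."""
    n_star = equilibrium_count(gain_per_year, loss_per_transcript_per_year)
    decay = math.exp(-loss_per_transcript_per_year * t_years)
    return n_star + (n0 - n_star) * decay


def half_life(loss_per_transcript_per_year):
    """Time after which half of a cohort of transcripts is expected to be lost."""
    return math.log(2) / loss_per_transcript_per_year


if __name__ == "__main__":
    gain, loss = 0.15, 5e-5  # rates quoted in the abstract
    print(equilibrium_count(gain, loss))
    print(half_life(loss))
```

With the quoted rates, the assumed model gives an equilibrium of about 3000 transcripts, the same order of magnitude as the roughly 2500 de novo transcripts reported per individual, and a transcript half-life on the order of 14,000 years.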
3.
iScience ; 26(10): 107832, 2023 Oct 20.
Article in English | MEDLINE | ID: mdl-37829199

ABSTRACT

Live birth (viviparity) has arisen repeatedly and independently among animals. We sequenced the genome and transcriptome of the viviparous Pacific beetle-mimic cockroach and performed comparative analyses with two other viviparous insect lineages, tsetse flies and aphids, to unravel the basis underlying the transition to viviparity in insects. We identified pathways undergoing adaptive evolution for insects, involved in urogenital remodeling, tracheal system, heart development, and nutrient metabolism. Transcriptomic analysis of cockroach and tsetse flies revealed that uterine remodeling and nutrient production are increased and the immune response is altered during pregnancy, facilitating structural and physiological changes to accommodate and nourish the progeny. These patterns of convergent evolution of viviparity among insects, together with similar adaptive mechanisms identified among vertebrates, highlight that the transition to viviparity requires changes in urogenital remodeling, enhanced tracheal and heart development (corresponding to angiogenesis in vertebrates), altered nutrient metabolism, and shifted immunity in animal systems.

4.
Mol Ecol Resour ; 23(7): 1706-1723, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37489282

ABSTRACT

Genome sequencing enables answering fundamental questions about the genetic basis of adaptation, population structure and epigenetic mechanisms. Yet, we usually need a suitable reference genome for mapping population-level resequencing data. In some model systems, multiple reference genomes are available, giving the challenging task of determining which reference genome best suits the data. Here, we compared the use of two different reference genomes for the three-spined stickleback (Gasterosteus aculeatus), one novel genome derived from a European gynogenetic individual and the published reference genome of a North American individual. Specifically, we investigated the impact of using a local reference versus one generated from a distinct lineage on several common population genomics analyses. Through mapping genome resequencing data of 60 sticklebacks from across Europe and North America, we demonstrate that genetic distance among samples and the reference genomes impacts downstream analyses. Using a local reference genome increased mapping efficiency and genotyping accuracy, effectively retaining more and better data. Despite comparable distributions of the metrics generated across the genome using SNP data (i.e. π, Tajima's D and FST), window-based statistics using different references resulted in different outlier genes and enriched gene functions. A marker-based analysis of DNA methylation distributions had a comparably high overlap in outlier genes and functions, yet with distinct differences depending on the reference genome. Overall, our results highlight how using a local reference genome decreases reference bias to increase confidence in downstream analyses of the data. Such results have significant implications in all reference-genome-based population genomic analyses.


Subjects
Metagenomics , Smegmamorpha , Animals , Genome/genetics , Chromosome Mapping , Genomics/methods , DNA Sequence Analysis/methods , Smegmamorpha/genetics
5.
Genome Res ; 33(6): 872-890, 2023 06.
Article in English | MEDLINE | ID: mdl-37442576

ABSTRACT

Novel genes are essential for evolutionary innovations and differ substantially even between closely related species. Recently, multiple studies across many taxa showed that some novel genes arise de novo, that is, from previously noncoding DNA. To characterize the underlying mutations that allowed de novo gene emergence and their order of occurrence, homologous regions must be detected within noncoding sequences in closely related sister genomes. So far, most studies do not detect noncoding homologs of de novo genes because of incomplete assemblies and annotations, and long evolutionary distances separating genomes. Here, we overcome these issues by searching for de novo expressed open reading frames (neORFs), the not-yet fixed precursors of de novo genes that emerged within a single species. We sequenced and assembled genomes with long-read technology and the corresponding transcriptomes from inbred lines of Drosophila melanogaster, derived from seven geographically diverse populations. We found line-specific neORFs in abundance but few neORFs shared by lines, suggesting a rapid turnover. Gain and loss of transcription is more frequent than the creation of ORFs, for example, by forming new start and stop codons. Consequently, the gain of ORFs becomes rate limiting and is frequently the initial step in neORFs emergence. Furthermore, transposable elements (TEs) are major drivers for intragenomic duplications of neORFs, yet TE insertions are less important for the emergence of neORFs. However, highly mutable genomic regions around TEs provide new features that enable gene birth. In conclusion, neORFs have a high birth-death rate, are rapidly purged, but surviving neORFs spread neutrally through populations and within genomes.


Subjects
Drosophila melanogaster , Metagenomics , Animals , Drosophila melanogaster/genetics , Open Reading Frames , DNA Transposable Elements/genetics , Biological Evolution , Molecular Evolution
6.
Mol Biol Evol ; 40(4)2023 04 04.
Article in English | MEDLINE | ID: mdl-37011142

ABSTRACT

New protein coding genes can emerge from genomic regions that previously did not contain any genes, via a process called de novo gene emergence. To synthesize a protein, DNA must be transcribed as well as translated. Both processes need certain DNA sequence features. Stable transcription requires promoters and a polyadenylation signal, while translation requires at least an open reading frame. We develop mathematical models based on mutation probabilities, and the assumption of neutral evolution, to find out how quickly genes emerge and are lost. We also investigate the effect of the order by which DNA features evolve, and if sequence composition is biased by mutation rate. We rationalize how genes are lost much more rapidly than they emerge, and how they preferentially arise in regions that are already transcribed. Our study not only answers some fundamental questions on the topic of de novo emergence but also provides a modeling framework for future studies.


Subjects
Molecular Evolution , Genomics , Mutation , Open Reading Frames , Genome
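
One ingredient that models of this kind need is the chance that a stretch of noncoding DNA happens to be free of stop codons. As a hedged illustration only (this is not the authors' model), the sketch below assumes independently drawn bases with a given GC content and computes the probability that a run of consecutive codons contains no stop codon (TAA, TAG or TGA). The function names are hypothetical.

```python
STOP_CODONS = ("TAA", "TAG", "TGA")


def stop_codon_probability(gc_content=0.5):
    """Probability that a random codon is a stop codon.

    Assumes independent bases with P(G) = P(C) = gc_content / 2 and
    P(A) = P(T) = (1 - gc_content) / 2 (an illustrative assumption).
    """
    p_gc = gc_content / 2.0
    p_at = (1.0 - gc_content) / 2.0
    freq = {"A": p_at, "T": p_at, "G": p_gc, "C": p_gc}
    return sum(freq[c[0]] * freq[c[1]] * freq[c[2]] for c in STOP_CODONS)


def stop_free_probability(n_codons, gc_content=0.5):
    """Probability that n_codons consecutive random codons contain no stop."""
    return (1.0 - stop_codon_probability(gc_content)) ** n_codons


if __name__ == "__main__":
    for n in (30, 100, 300):
        print(n, stop_free_probability(n))
```

Under uniform base composition the stop-codon probability is 3/64, so a 100-codon stretch is stop-free with a probability of a little under 1%. GC-rich sequence makes stop-free stretches more likely under this assumption, because all three stop codons are AT-rich.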
7.
F1000Res ; 12: 347, 2023.
Article in English | MEDLINE | ID: mdl-37113259

ABSTRACT

Background: De novo protein coding genes emerge from scratch in the non-coding regions of the genome and have, per definition, no homology to other genes. Therefore, their encoded de novo proteins belong to the so-called "dark protein space". So far, only four de novo protein structures have been experimentally approximated. Low homology, presumed high disorder and limited structures result in low confidence structural predictions for de novo proteins in most cases. Here, we look at the most widely used structure and disorder predictors and assess their applicability for de novo emerged proteins. Since AlphaFold2 is based on the generation of multiple sequence alignments and was trained on solved structures of largely conserved and globular proteins, its performance on de novo proteins remains unknown. More recently, natural language models of proteins have been used for alignment-free structure predictions, potentially making them more suitable for de novo proteins than AlphaFold2. Methods: We applied different disorder predictors (IUPred3 short/long, flDPnn) and structure predictors, AlphaFold2 on the one hand and language-based models (Omegafold, ESMfold, RGN2) on the other hand, to four de novo proteins with experimental evidence on structure. We compared the resulting predictions between the different predictors as well as to the existing experimental evidence. Results: Results from IUPred, the most widely used disorder predictor, depend heavily on the choice of parameters and differ significantly from flDPnn which has been found to outperform most other predictors in a comparative assessment study recently. Similarly, different structure predictors yielded varying results and confidence scores for de novo proteins. 
Conclusions: We suggest that, while in some cases protein language model based approaches might be more accurate than AlphaFold2, the structure prediction of de novo emerged proteins remains a difficult task for any predictor, be it disorder or structure.


Subjects
Genome , Proteins , Proteins/chemistry , Sequence Alignment , Machine Learning
8.
Nat Ecol Evol ; 7(4): 570-580, 2023 04.
Article in English | MEDLINE | ID: mdl-37024625

ABSTRACT

De novo gene emergence provides a route for new proteins to be formed from previously non-coding DNA. Proteins born in this way are considered random sequences and typically assumed to lack defined structure. While it remains unclear how likely a de novo protein is to assume a soluble and stable tertiary structure, intersecting evidence from random sequence and de novo-designed proteins suggests that native-like biophysical properties are abundant in sequence space. Taking putative de novo proteins identified in human and fly, we experimentally characterize a library of these sequences to assess their solubility and structure propensity. We compare this library to a set of synthetic random proteins with no evolutionary history. Bioinformatic prediction suggests that de novo proteins may have remarkably similar distributions of biophysical properties to unevolved random sequences of a given length and amino acid composition. However, upon expression in vitro, de novo proteins exhibit moderately higher solubility which is further induced by the DnaK chaperone system. We suggest that while synthetic random sequences are a useful proxy for de novo proteins in terms of structure propensity, de novo proteins may be better integrated in the cellular system than random expectation, given their higher solubility.


Subjects
Proteins , Proteomics , Humans , Proteins/chemistry , Computational Biology
10.
Mol Ecol ; 32(2): 369-380, 2023 01.
Article in English | MEDLINE | ID: mdl-36320186

ABSTRACT

Transposable elements (TEs) are mobile genetic sequences, which can cause the accumulation of genomic damage in the lifetime of an organism. The regulation of TEs, for instance via the piRNA-pathway, is an important mechanism to protect the integrity of genomes, especially in the germ-line where mutations can be transmitted to offspring. In eusocial insects, soma and germ-line are divided among worker and reproductive castes, so one may expect caste-specific differences in TE regulation to exist. To test this, we compared whole-genome levels of repeat element transcription in the fat body of female workers, kings and five different queen stages of the higher termite, Macrotermes natalensis. In this species, queens can live over 20 years, maintaining near maximum reproductive output, while sterile workers only live weeks. We found a strong, positive correlation between TE expression and the expression of neighbouring genes in all castes. However, we found substantially higher TE activity in workers than in reproductives. Furthermore, TE expression did not increase with age in queens, despite a sevenfold increase in overall gene expression, due to a significant upregulation of the piRNA-pathway in 20-year-old queens. Our results suggest a caste- and age-specific regulation of the piRNA-pathway has evolved in higher termites that is analogous to germ-line-specific activity in solitary organisms. In the fat body of these termite queens, an important metabolic tissue for maintaining their extreme longevity and reproductive output, an efficient regulation of TEs likely protects genome integrity, thus further promoting reproductive fitness even at high age.


Subjects
Isoptera , Animals , Female , Isoptera/genetics , Insects , Fertility , Reproduction/genetics , Longevity
11.
Genes (Basel) ; 13(11)2022 10 26.
Article in English | MEDLINE | ID: mdl-36360186

ABSTRACT

(1) Unravelling the molecular basis underlying major evolutionary transitions can shed light on how complex phenotypes arise. The evolution of eusociality, a major evolutionary transition, has been demonstrated to be accompanied by enhanced gene regulation. Numerous pieces of evidence suggest the major impact of transposon insertion on gene regulation and its role in adaptive evolution. Transposons have been shown to play a role in gene duplication involved in the eusocial transition in termites. However, evidence of the molecular basis underlying the eusocial transition in Blattodea remains scarce. Could transposons have facilitated the eusocial transition in termites through shifts of gene expression? (2) Using available cockroach and termite genomes and transcriptomes, we investigated if transposons insert more frequently in genes with differential expression in queens and workers and if those genes could be linked to specific functions essential for eusocial transition. (3) The insertion rate of transposons differs among differentially expressed genes and displays opposite trends between termites and cockroaches. The functions of termite transposon-rich queen- and worker-biased genes are related to reproduction and ageing and behaviour and gene expression, respectively. (4) Our study provides further evidence on the role of transposons in the evolution of eusociality, potentially through shifts in gene expression.


Subjects
Cockroaches , Isoptera , Animals , Cockroaches/genetics , DNA Transposable Elements/genetics , Social Behavior , Isoptera/genetics , Gene Expression
12.
J Mol Evol ; 90(6): 418-428, 2022 12.
Article in English | MEDLINE | ID: mdl-36181519

ABSTRACT

Vertebrate blood coagulation is controlled by a cascade containing more than 20 proteins. The cascade proteins are found in the blood in their zymogen forms and when the cascade is triggered by tissue damage, zymogens are activated and in turn activate their downstream proteins by serine protease activity. In this study, we examined proteomes of 21 chordates, of which 18 are vertebrates, to reveal the modular evolution of the blood coagulation cascade. Additionally, two Arthropoda species were used to compare domain arrangements of the proteins belonging to the hemolymph clotting and the blood coagulation cascades. Within the vertebrate coagulation protein set, almost half of the studied proteins are shared with jawless vertebrates. Domain similarity analyses revealed that there are multiple possible evolutionary trajectories for each coagulation protein. During the evolution of higher vertebrate clades, gene and genome duplications led to the formation of other coagulation cascade proteins.


Subjects
Blood Coagulation Factors , Chordata , Animals , Blood Coagulation Factors/genetics , Blood Coagulation Factors/metabolism , Vertebrates/genetics , Blood Coagulation/genetics , Chordata/genetics , Genome
13.
Microb Biotechnol ; 15(11): 2845-2853, 2022 11.
Article in English | MEDLINE | ID: mdl-36099491

ABSTRACT

Directed evolution (DE) is a widely used method for improving the function of biomolecules via multiple rounds of mutation and selection. Microfluidic droplets have emerged as an important means to screen the large libraries needed for DE, but this approach was so far partially limited by the need to lyse cells, recover DNA, and retransform into cells for the next round, necessitating the use of a high-copy number plasmid or oversampling. The recently developed live cell recovery avoids some of these limitations by directly regrowing selected cells after sorting. However, repeated sorting cycles used to further enrich the most active variants ultimately resulted in unfavourable recovery of empty plasmid vector-containing cells over those expressing the protein of interest. In this study, we found that engineering of the original expression vector solved the problem of false-positive cells containing empty vectors (i.e. plasmids lacking an insert). Five approaches to measure activity of cell-displayed enzymes in microdroplets were compared. Comparing various cell treatment methods prior to droplet sorting yielded two findings. First, substrate encapsulation from the start, that is, prior to expression of enzyme, showed no disadvantage compared with post-induction substrate addition by pico-injection with respect to recovery of true positive variants. Second, in-droplet cell growth prior to induction of enzyme production improves the total amount of cells retrieved (recovery) and the proportion of true positive variants (enrichment) after droplet sorting.


Subjects
Escherichia coli , Microfluidics , Escherichia coli/metabolism , Plasmids , Microfluidics/methods , Genetic Vectors , Mutation
14.
Mol Ecol ; 31(19): 4991-5004, 2022 10.
Article in English | MEDLINE | ID: mdl-35920076

ABSTRACT

The ecological success of social Hymenoptera (ants, bees, wasps) depends on the division of labour between the queen and workers. Each caste exhibits highly specialized morphology, behaviour, and life-history traits, such as lifespan and fecundity. Despite strong defences against alien intruders, insect societies are vulnerable to social parasites, such as workerless inquilines or slave-making ants. Here, we investigate whether gene expression varies in parallel ways between lifestyles (slave-making versus host ants) across five independent origins of ant slavery in the "Formicoxenus-group" of the ant tribe Crematogastrini. As caste differences are often less pronounced in slave-making ants than in nonparasitic ants, we also compare caste-specific gene expression patterns between lifestyles. We demonstrate a substantial overlap in expression differences between queens and workers across taxa, irrespective of lifestyle. Caste affects the transcriptomes much more profoundly than lifestyle, as indicated by 37 times more genes being linked to caste than to lifestyle and by multiple caste-associated modules of coexpressed genes with strong connectivity. However, several genes and one gene module are linked to slave-making across the independent origins of this parasitic lifestyle, pointing to some evolutionary convergence. Finally, we do not find evidence for an interaction between caste and lifestyle, indicating that caste differences in gene expression remain consistent even when species switch to a parasitic lifestyle. Our findings strongly support the existence of a core set of genes whose expression is linked to the queen and worker caste in this ant taxon, as proposed by the "genetic toolkit" hypothesis.


Subjects
Ants , Life History Traits , Animals , Ants/genetics , Bees/genetics , Animal Behavior , Biological Evolution , Transcriptome/genetics
15.
Protein Sci ; 31(8): e4371, 2022 08.
Article in English | MEDLINE | ID: mdl-35900020

ABSTRACT

Over the past decade, evidence has accumulated that new protein-coding genes can emerge de novo from previously non-coding DNA. Most studies have focused on large scale computational predictions of de novo protein-coding genes across a wide range of organisms. In contrast, experimental data concerning the folding and function of de novo proteins are scarce. This might be due to difficulties in handling de novo proteins in vitro, as most are short and predicted to be disordered. Here, we propose a guideline for the effective expression of eukaryotic de novo proteins in Escherichia coli. We used 11 sequences from Drosophila melanogaster and 10 from Homo sapiens, that are predicted de novo proteins from former studies, for heterologous expression. The candidate de novo proteins have varying secondary structure and disorder content. Using multiple combinations of purification tags, E. coli expression strains, and chaperone systems, we were able to increase the number of solubly expressed putative de novo proteins from 30% to 62%. Our findings indicate that the best combination for expressing putative de novo proteins in E. coli is a GST-tag with T7 Express cells and co-expressed chaperones. We found that, overall, proteins with higher predicted disorder were easier to express. STATEMENT: Today, we know that proteins do not only evolve by duplication and divergence of existing proteins but also arise from previously non-coding DNA. These proteins are called de novo proteins. Their properties are still poorly understood and their experimental analysis faces major obstacles. Here, we aim to present a starting point for soluble expression of de novo proteins with the help of chaperones and thereby enable further characterization.


Subjects
Drosophila melanogaster , Escherichia coli , Animals , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Eukaryotic Cells/metabolism , Molecular Chaperones/genetics , Molecular Chaperones/metabolism , Protein Secondary Structure
16.
Genes (Basel) ; 13(2)2022 01 31.
Article in English | MEDLINE | ID: mdl-35205330

ABSTRACT

De novo genes are novel genes which emerge from non-coding DNA. Until now, little is known about de novo genes' properties, correlated to their age and mechanisms of emergence. In this study, we investigate four related properties: introns, upstream regulatory motifs, 5' untranslated regions (UTRs) and protein domains, in 23,135 human proto-genes. We found that proto-genes contain introns, whose number and position correlate with the genomic position of proto-gene emergence. The origin of these introns is debated, as our results suggest that 41% of proto-genes might have captured existing introns, and 13.7% of them do not splice the ORF. We show that proto-genes which emerged via overprinting tend to be more enriched in core promoter motifs, while intergenic and intronic genes are more enriched in enhancers, even if the TATA motif is most commonly found upstream in these genes. Intergenic and intronic 5' UTRs of proto-genes have a lower potential to stabilise mRNA structures than exonic proto-genes and established human genes. Finally, we confirm that proteins expressed by proto-genes gain new putative domains with age. Overall, we find that regulatory motifs inducing transcription and translation of previously non-coding sequences may facilitate proto-gene emergence. Our study demonstrates that introns, 5' UTRs, and domains have specific properties in proto-genes. We also emphasize that the genomic positions of de novo genes strongly impact these properties.


Subjects
Genomics , 5' Untranslated Regions , Exons/genetics , Humans , Introns/genetics , Genetic Promoter Regions
17.
Commun Biol ; 5(1): 44, 2022 01 13.
Article in English | MEDLINE | ID: mdl-35027667

ABSTRACT

Kings and queens of eusocial termites can live for decades, while queens sustain a nearly maximal fertility. To investigate the molecular mechanisms underlying their long lifespan, we carried out transcriptomics, lipidomics and metabolomics in Macrotermes natalensis on sterile short-lived workers, long-lived kings and five stages spanning twenty years of adult queen maturation. Reproductives share gene expression differences from workers in agreement with a reduction of several aging-related processes, involving upregulation of DNA damage repair and mitochondrial functions. Anti-oxidant gene expression is downregulated, while peroxidability of membranes in queens decreases. Against expectations, we observed an upregulated gene expression in fat bodies of reproductives of several components of the IIS pathway, including an insulin-like peptide, Ilp9. This pattern does not lead to deleterious fat storage in physogastric queens, while simple sugars dominate in their hemolymph and large amounts of resources are allocated towards oogenesis. Our findings support the notion that all processes causing aging need to be addressed simultaneously in order to prevent it.


Subjects
Aging , DNA Repair , Insulin/physiology , Isoptera/physiology , Animals , Fertility , Longevity , Reproduction , Up-Regulation
18.
Mol Biol Evol ; 39(1)2022 01 07.
Article in English | MEDLINE | ID: mdl-34668533

ABSTRACT

The evolution of an obligate parasitic lifestyle often leads to the reduction of morphological and physiological traits, which may be accompanied by loss of genes and functions. Slave-making ants are social parasites that exploit the work force of closely related ant species for social behaviors such as brood care and foraging. Recent divergence between these social parasites and their hosts enables comparative studies of gene family evolution. We sequenced the genomes of eight ant species, representing three independent origins of ant slavery. During the evolution of eusociality, chemoreceptor genes multiplied due to the importance of chemical communication in insect societies. We investigated the evolutionary fate of these chemoreceptors and found that slave-making ant genomes harbored only half as many gustatory receptors as their hosts', potentially mirroring the outsourcing of foraging tasks to host workers. In addition, parasites had fewer odorant receptors and their loss shows striking patterns of convergence across independent origins of parasitism, in particular in orthologs often implicated in sociality like the 9-exon odorant receptors. These convergent losses represent a rare case of convergent molecular evolution at the level of individual genes. Thus, evolution can operate in a way that is both repeatable and reversible when independent ant lineages lose important social traits during the transition to a parasitic lifestyle.


Subjects
Ants , Odorant Receptors , Animals , Ants/genetics , Animal Behavior/physiology , Molecular Evolution , Odorant Receptors/genetics , Social Behavior
19.
J R Soc Interface ; 18(184): 20210389, 2021 11.
Article in English | MEDLINE | ID: mdl-34727710

ABSTRACT

Evolutionary relationships of protein families can be characterized either by networks or by trees. Whereas trees allow for hierarchical grouping and reconstruction of the most likely ancestral sequences, networks lack a time axis but allow for thresholds of pairwise sequence identity to be chosen and, therefore, the clustering of family members with presumably more similar functions. Here, we use the large family of arylsulfatases and phosphonate monoester hydrolases to investigate similarities, strengths and weaknesses in tree and network representations. For varying thresholds of pairwise sequence identity, values of betweenness centrality and clustering coefficients were derived for nodes of the reconstructed ancestors to measure the propensity to act as a bridge in a network. Based on these properties, ancestral protein sequences emerge as bridges in protein sequence networks. Interestingly, many ancestral protein sequences appear close to extant sequences. Therefore, reconstructed ancestor sequences might also be interpreted as yet-to-be-identified homologues. The concept of ancestor reconstruction is compared to consensus sequences, too. It was found that hub sequences in a network, e.g. reconstructed ancestral sequences that are connected to many neighbouring sequences, share closer similarity with derived consensus sequences. Therefore, some reconstructed ancestor sequences can also be interpreted as consensus sequences.


Subjects
Molecular Evolution , Proteins , Amino Acid Sequence , Biological Evolution , Phylogeny
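
The network measures named in the abstract of record 19 can be illustrated on a toy example. The sketch below is a minimal, pure-Python illustration and not the authors' pipeline: it builds a sequence network by thresholding pairwise identities, then computes raw (unnormalised) betweenness centrality with Brandes' algorithm and the local clustering coefficient. The identity values and node names, including the reconstructed ancestor "anc", are made up for illustration, and all function names are hypothetical.

```python
from collections import deque
from itertools import combinations

# Hypothetical pairwise identities: "anc" stands for a reconstructed ancestor
# of two extant pairs (A, B) and (C, D).
TOY_IDENTITY = {
    ("anc", "A"): 0.6, ("anc", "B"): 0.6, ("anc", "C"): 0.6, ("anc", "D"): 0.6,
    ("A", "B"): 0.9, ("C", "D"): 0.9,
    ("A", "C"): 0.3, ("A", "D"): 0.3, ("B", "C"): 0.3, ("B", "D"): 0.3,
}


def build_network(identity, threshold):
    """Connect two sequences when their pairwise identity is >= threshold."""
    adj = {}
    for (a, b), ident in identity.items():
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if ident >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbours of node that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def betweenness(adj):
    """Raw betweenness centrality (Brandes), unweighted undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in order of increasing distance
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):       # accumulate dependencies back to s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2.0 for v, b in bc.items()}


if __name__ == "__main__":
    for threshold in (0.5, 0.8):
        network = build_network(TOY_IDENTITY, threshold)
        print(threshold, betweenness(network)["anc"],
              clustering_coefficient(network, "anc"))
```

In this toy data the ancestor bridges the two extant pairs at a permissive threshold (high betweenness, low clustering) and becomes isolated at a strict one, which is the kind of threshold dependence the abstract describes. Note that some graph libraries report normalised betweenness by default, so values may differ by a constant factor.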
20.
PLoS Genet ; 17(9): e1009787, 2021 09.
Article in English | MEDLINE | ID: mdl-34478447

ABSTRACT

Comparative genomics has enabled the identification of genes that potentially evolved de novo from non-coding sequences. Many such genes are expressed in male reproductive tissues, but their functions remain poorly understood. To address this, we conducted a functional genetic screen of over 40 putative de novo genes with testis-enriched expression in Drosophila melanogaster and identified one gene, atlas, required for male fertility. Detailed genetic and cytological analyses showed that atlas is required for proper chromatin condensation during the final stages of spermatogenesis. Atlas protein is expressed in spermatid nuclei and facilitates the transition from histone- to protamine-based chromatin packaging. Complementary evolutionary analyses revealed the complex evolutionary history of atlas. The protein-coding portion of the gene likely arose at the base of the Drosophila genus on the X chromosome but was unlikely to be essential, as it was then lost in several independent lineages. Within the last ~15 million years, however, the gene moved to an autosome, where it fused with a conserved non-coding RNA and evolved a non-redundant role in male fertility. Altogether, this study provides insight into the integration of novel genes into biological processes, the links between genomic innovation and functional evolution, and the genetic control of a fundamental developmental process, gametogenesis.


Subjects
Chromatin/metabolism , Drosophila melanogaster/genetics , Molecular Evolution , Spermatids/metabolism , Animals , Cell Nucleus/metabolism , Clustered Regularly Interspaced Short Palindromic Repeats , Fertility/genetics , Male , RNA Interference , Spermatogenesis/genetics