Results 1 - 20 of 33
1.
Plant J ; 117(3): 944-955, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37947292

ABSTRACT

Scots pine (Pinus sylvestris L.) is one of the most widespread and economically important conifer species in the world. Applications like genomic selection and association studies, which could help accelerate breeding cycles, are challenging in Scots pine because of its large and repetitive genome. For this reason, genotyping tools for conifer species, and in particular for Scots pine, are commonly based on transcribed regions of the genome. In this article, we present the Axiom Psyl50K array, the first single nucleotide polymorphism (SNP) genotyping array for Scots pine based on whole-genome resequencing, that represents both genic and intergenic regions. This array was designed following a two-step procedure: first, 192 trees were sequenced, and a 430K SNP screening array was constructed. Then, 480 samples, including haploid megagametophytes, full-sib family trios, breeding population, and range-wide individuals from across Eurasia were genotyped with the screening array. The best 50K SNPs were selected based on quality, replicability, distribution across the draft genome assembly, balance between genic and intergenic regions, and genotype-environment and genotype-phenotype associations. Of the final 49 877 probes tiled in the array, 20 372 (40.84%) occur inside gene models, while the rest lie in intergenic regions. We also show that the Psyl50K array can yield enough high-confidence SNPs for genetic studies in pine species from North America and Eurasia. This new genotyping tool will be a valuable resource for high-throughput fundamental and applied research of Scots pine and other pine species.


Subjects
Pinus sylvestris, Pinus, Humans, Pinus sylvestris/genetics, Single Nucleotide Polymorphism/genetics, Genotype, Plant Breeding, Pinus/genetics, Intergenic DNA
2.
Commun Biol ; 6(1): 139, 2023 02 02.
Article in English | MEDLINE | ID: mdl-36732562

ABSTRACT

Ipsilateral breast tumor recurrence (IBTR) is a clinically important event, where an isolated in-breast recurrence is a potentially curable event but associated with an increased risk of distant metastasis and breast cancer death. It remains unclear if IBTRs are associated with molecular changes that can be explored as a resource for precision medicine strategies. Here, we employed proteogenomics to analyze a cohort of 27 primary breast cancers and their matched IBTRs to define proteogenomic determinants of molecular tumor evolution. Our analyses revealed a relationship between hormone receptor status and proliferation levels resulting in the gain of somatic mutations and copy number. This in turn re-programmed the transcriptome and proteome towards highly replicating and genomically unstable IBTRs, possibly enhanced by APOBEC3B. In order to investigate the origins of IBTRs, a second analysis that included primaries with no recurrence pinpointed proliferation and immune infiltration as predictive of IBTR. In conclusion, our study shows that breast tumors evolve into different IBTRs depending on hormonal status and proliferation and that immune cell infiltration and Ki-67 are significantly elevated in primary tumors that develop IBTR. These results can serve as a starting point to explore markers to predict IBTR formation and stratify patients for adjuvant therapy.


Subjects
Breast Neoplasms, Animal Mammary Neoplasms, Proteogenomics, Humans, Animals, Female, Breast Neoplasms/genetics, Breast Neoplasms/pathology, Segmental Mastectomy, Local Neoplasm Recurrence/genetics, Local Neoplasm Recurrence/pathology, Combined Modality Therapy, Cytidine Deaminase, Minor Histocompatibility Antigens
3.
Curr Biol ; 32(20): 4360-4371.e6, 2022 10 24.
Article in English | MEDLINE | ID: mdl-36087578

ABSTRACT

Supergenes govern multi-trait-balanced polymorphisms in a wide range of systems; however, our understanding of their origins and evolution remains incomplete. The reciprocal placement of stigmas and anthers in pin and thrum floral morphs of distylous species constitutes an iconic example of a balanced polymorphism governed by a supergene, the distyly S-locus. Recent studies have shown that the Primula and Turnera distyly supergenes are both hemizygous in thrums, but it remains unknown whether hemizygosity is pervasive among distyly S-loci. As hemizygosity has major consequences for supergene evolution and loss, clarifying whether this genetic architecture is shared among distylous species is critical. Here, we have characterized the genetic architecture and evolution of the distyly supergene in Linum by generating a chromosome-level genome assembly of Linum tenue, followed by the identification of the S-locus using population genomic data. We show that hemizygosity and thrum-specific expression of S-linked genes, including a pistil-expressed candidate gene for style length, are major features of the Linum S-locus. Structural variation is likely instrumental for recombination suppression, and although the non-recombining dominant haplotype has accumulated transposable elements, S-linked genes are not under relaxed purifying selection. Our findings reveal remarkable convergence in the genetic architecture and evolution of independently derived distyly supergenes, provide a counterexample to classic inversion-based supergenes, and shed new light on the origin and maintenance of an iconic floral polymorphism.


Subjects
Flax, Flax/genetics, DNA Transposable Elements, Flowers/genetics, Genomics, Genetic Loci, Molecular Evolution
4.
Skelet Muscle ; 12(1): 16, 2022 07 02.
Article in English | MEDLINE | ID: mdl-35780170

ABSTRACT

BACKGROUND: Skeletal muscle fiber type distribution has implications for human health, muscle function, and performance. This knowledge has been gathered using labor-intensive and costly methodology that limited these studies. Here, we present a method based on muscle tissue RNA sequencing data (totRNAseq) to estimate the distribution of skeletal muscle fiber types from frozen human samples, allowing for a larger number of individuals to be tested. METHODS: By using single-nuclei RNA sequencing (snRNAseq) data as a reference, cluster expression signatures were produced by averaging gene expression of cluster gene markers and then applying these to totRNAseq data and inferring muscle fiber nuclei type via linear matrix decomposition. This estimate was then compared with fiber type distribution measured by ATPase staining or myosin heavy chain protein isoform distribution of 62 muscle samples in two independent cohorts (n = 39 and 22). RESULTS: The correlations between the sequencing-based method and the other two were r_ATPase = 0.44 [0.13-0.67] and r_myosin = 0.83 [0.61-0.93] (95% CI), with p = 5.70 × 10⁻³ and 2.00 × 10⁻⁶, respectively. The deconvolution inference of fiber type composition was accurate even for very low totRNAseq sequencing depths, i.e., down to an average of ~10,000 paired-end reads. CONCLUSIONS: This new method (https://github.com/OlaHanssonLab/PredictFiberType) consequently allows for measurement of fiber type distribution of a larger number of samples using totRNAseq in a cost- and labor-efficient way. It is now feasible to study the association between fiber type distribution and e.g. health outcomes in large well-powered studies.


Subjects
Skeletal Muscle Fibers, RNA, Base Sequence, Humans, RNA Sequence Analysis, Exome Sequencing
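
The deconvolution idea described in the abstract of record 4 can be illustrated with a small sketch. This is not the PredictFiberType implementation: it assumes only two fiber types, uses invented marker-gene signature values, and solves an ordinary least-squares problem in closed form, then clips negative estimates and rescales so the fractions sum to one. The function name `estimate_two_type_fractions` and all numbers are hypothetical.

```python
def estimate_two_type_fractions(signatures, mixture):
    """Estimate the fractions of two cell types in a bulk expression profile.

    signatures: list of (type1_expression, type2_expression), one pair per marker gene
                (hypothetical reference values, e.g. averaged per cluster).
    mixture:    list of bulk expression values for the same genes, in the same order.
    """
    if len(signatures) != len(mixture) or len(mixture) < 2:
        raise ValueError("need at least two genes, with matching lengths")

    # Normal equations (S^T S) f = S^T m for the two unknown fractions.
    a11 = sum(s1 * s1 for s1, _ in signatures)
    a12 = sum(s1 * s2 for s1, s2 in signatures)
    a22 = sum(s2 * s2 for _, s2 in signatures)
    b1 = sum(s1 * m for (s1, _), m in zip(signatures, mixture))
    b2 = sum(s2 * m for (_, s2), m in zip(signatures, mixture))

    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("signatures are collinear; fractions are not identifiable")

    # Cramer's rule for the 2x2 system.
    f1 = (b1 * a22 - a12 * b2) / det
    f2 = (a11 * b2 - a12 * b1) / det

    # Fractions cannot be negative; clip, then rescale to sum to one.
    f1, f2 = max(f1, 0.0), max(f2, 0.0)
    total = f1 + f2
    if total == 0.0:
        raise ValueError("mixture is not explained by either signature")
    return f1 / total, f2 / total


# Toy data: four marker genes, a sample made of 30% type 1 and 70% type 2.
toy_signatures = [(10.0, 1.0), (8.0, 2.0), (1.0, 9.0), (2.0, 12.0)]
toy_mixture = [0.3 * s1 + 0.7 * s2 for s1, s2 in toy_signatures]
print(estimate_two_type_fractions(toy_signatures, toy_mixture))
```

With more than two cell types, a constrained solver such as non-negative least squares would be the usual choice; the closed form here is only meant to keep the example dependency-free.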
5.
BMC Bioinformatics ; 23(1): 228, 2022 Jun 13.
Article in English | MEDLINE | ID: mdl-35698034

ABSTRACT

BACKGROUND: Many wild species have suffered drastic population size declines over the past centuries, which have led to 'genomic erosion' processes characterized by reduced genetic diversity, increased inbreeding, and accumulation of harmful mutations. Yet, genomic erosion estimates of modern-day populations often lack concordance with dwindling population sizes and conservation status of threatened species. One way to directly quantify the genomic consequences of population declines is to compare genome-wide data from pre-decline museum samples and modern samples. However, doing so requires computational data processing and analysis tools specifically adapted to comparative analyses of degraded, ancient or historical, DNA data with modern DNA data as well as personnel trained to perform such analyses. RESULTS: Here, we present a highly flexible, scalable, and modular pipeline to compare patterns of genomic erosion using samples from disparate time periods. The GenErode pipeline uses state-of-the-art bioinformatics tools to simultaneously process whole-genome re-sequencing data from ancient/historical and modern samples, and to produce comparable estimates of several genomic erosion indices. No programming knowledge is required to run the pipeline and all bioinformatic steps are well-documented, making the pipeline accessible to users with different backgrounds. GenErode is written in Snakemake and Python3 and uses Conda and Singularity containers to achieve reproducibility on high-performance compute clusters. The source code is freely available on GitHub ( https://github.com/NBISweden/GenErode ). CONCLUSIONS: GenErode is a user-friendly and reproducible pipeline that enables the standardization of genomic erosion indices from temporally sampled whole genome re-sequencing data.


Subjects
Computational Biology, Genome, Animals, Endangered Species, Genomics, Reproducibility of Results, Software
6.
iScience ; 25(5): 104303, 2022 May 20.
Article in English | MEDLINE | ID: mdl-35573201

ABSTRACT

Transgenerational inheritance of environmentally induced epigenetic marks can have significant impacts on eco-evolutionary dynamics, but the phenomenon remains controversial in ecological model systems. We used whole-genome bisulfite sequencing of individual water fleas (Daphnia magna) to assess whether environmentally induced DNA methylation is transgenerationally inherited. Genetically identical females were exposed to one of three natural stressors, or a de-methylating drug, and their offspring were propagated clonally for four generations under control conditions. We identified between 70 and 225 differentially methylated CpG positions (DMPs) in F1 individuals whose mothers were exposed to a natural stressor. Roughly half of these environmentally induced DMPs persisted until generation F4. In contrast, treatment with the drug demonstrated that pervasive hypomethylation upon exposure is reset almost completely after one generation. These results suggest that environmentally induced DNA methylation is non-random and stably inherited across generations in Daphnia, making epigenetic inheritance a putative factor in the eco-evolutionary dynamics of freshwater communities.

7.
Nat Commun ; 13(1): 2532, 2022 05 09.
Article in English | MEDLINE | ID: mdl-35534486

ABSTRACT

Despite the success of genome-wide association studies, much of the genetic contribution to complex traits remains unexplained. Here, we analyse high coverage whole-genome sequencing data, to evaluate the contribution of rare genetic variants to 414 plasma proteins. The frequency distribution of genetic variants is skewed towards the rare spectrum, and damaging variants are more often rare. We estimate that less than 4.3% of the narrow-sense heritability is expected to be explained by rare variants in our cohort. Using a gene-based approach, we identify Cis-associations for 237 of the proteins, which is slightly more compared to a GWAS (N = 213), and we identify 34 associated loci in Trans. Several associations are driven by rare variants, which have larger effects, on average. We therefore conclude that rare variants could be of importance for precision medicine applications, but have a more limited contribution to the missing heritability of complex diseases.


Subjects
Genome-Wide Association Study, Multifactorial Inheritance, Blood Proteins/genetics, Genetic Predisposition to Disease, Genetic Variation, Humans, Multifactorial Inheritance/genetics, Single Nucleotide Polymorphism, Whole Genome Sequencing
9.
Mol Biol Evol ; 38(12): 5275-5291, 2021 12 09.
Article in English | MEDLINE | ID: mdl-34542640

ABSTRACT

How the avian sex chromosomes first evolved from autosomes remains elusive as 100 million years (My) of divergence and degeneration obscure their evolutionary history. The Sylvioidea group of songbirds is interesting for understanding avian sex chromosome evolution because a chromosome fusion event ∼24 Ma formed "neo-sex chromosomes" consisting of an added (new) and an ancestral (old) part. Here, we report the complete female genome (ZW) of one Sylvioidea species, the great reed warbler (Acrocephalus arundinaceus). Our long-read assembly shows that the added region has been translocated to both Z and W, and whereas the added-Z has retained its gene order the added-W part has been heavily rearranged. Phylogenetic analyses show that recombination between the homologous added-Z and -W regions continued after the fusion event, and that recombination suppression across this region took several million years to be completed. Moreover, recombination suppression was initiated across multiple positions over the added-Z, which is not consistent with a simple linear progression starting from the fusion point. As expected following recombination suppression, the added-W show signs of degeneration including repeat accumulation and gene loss. Finally, we present evidence for nonrandom maintenance of slowly evolving and dosage-sensitive genes on both ancestral- and added-W, a process causing correlated evolution among orthologous genes across broad taxonomic groups, regardless of sex linkage.


Subjects
Passeriformes, Songbirds, Animals, Molecular Evolution, Female, Passeriformes/genetics, Phylogeny, Genetic Recombination, Sex Chromosomes/genetics, Songbirds/genetics
10.
Mar Biotechnol (NY) ; 23(3): 402-416, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33931810

ABSTRACT

Barnacles are key marine crustaceans in several habitats, and they constitute a common practical problem by causing biofouling on man-made marine constructions and ships. Despite causing considerable ecological and economic impacts, there is a surprising void of basic genomic knowledge, and a barnacle reference genome is lacking. We here set out to characterize the genome of the bay barnacle Balanus improvisus (= Amphibalanus improvisus) based on short-read whole-genome sequencing and experimental genome size estimation. We show both experimentally (DNA staining and flow cytometry) and computationally (k-mer analysis) that B. improvisus has a haploid genome size of ~ 740 Mbp. A pilot genome assembly rendered a total assembly size of ~ 600 Mbp and was highly fragmented with an N50 of only 2.2 kbp. Further assembly-based and assembly-free analyses revealed that the very limited assembly contiguity is due to the B. improvisus genome having an extremely high nucleotide diversity (π) in coding regions (average π ≈ 5% and average π in fourfold degenerate sites ≈ 20%), and an overall high repeat content (at least 40%). We also report on high variation in the α-octopamine receptor OctA (average π = 3.6%), which might increase the risk that barnacle populations evolve resistance toward antifouling agents. The genomic features described here can help in planning for a future high-quality reference genome, which is urgently needed to properly explore and understand proteins of interest in barnacle biology and marine biotechnology and for developing better antifouling strategies.


Subjects
Genome, Thoracica/genetics, Animals, Biofouling, Nucleotides, Biogenic Amine Receptors/genetics
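
The nucleotide diversity (π) values quoted in the abstract of record 10 are averages of pairwise differences per site. The sketch below computes π from the textbook definition on a set of aligned, equal-length haplotype sequences. It is only an illustration: the study estimated π from short-read data, presumably with other estimators, and the function name and toy sequences are invented.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean number of pairwise differences per site across aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned, non-empty and of equal length")

    pair_count = 0
    diff_total = 0
    for a, b in combinations(sequences, 2):
        # Count mismatching positions for this pair of haplotypes.
        diff_total += sum(1 for x, y in zip(a, b) if x != y)
        pair_count += 1

    return diff_total / (pair_count * length)


# Toy alignment (hypothetical): three haplotypes, four sites.
toy_haplotypes = ["ACGT", "ACGA", "ACCA"]
print(nucleotide_diversity(toy_haplotypes))
```

In the toy alignment the three pairs differ at 1, 2, and 1 sites, so π is 4 differences over 3 pairs and 4 sites, or one third.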
11.
F1000Res ; 9: 63, 2020.
Article in English | MEDLINE | ID: mdl-32269765

ABSTRACT

Whole-genome sequencing (WGS) is a fundamental technology for research to advance precision medicine, but the limited availability of portable and user-friendly workflows for WGS analyses poses a major challenge for many research groups and hampers scientific progress. Here we present Sarek, an open-source workflow to detect germline variants and somatic mutations based on sequencing data from WGS, whole-exome sequencing (WES), or gene panels. Sarek features (i) easy installation, (ii) robust portability across different computer environments, (iii) comprehensive documentation, (iv) transparent and easy-to-read code, and (v) extensive quality metrics reporting. Sarek is implemented in the Nextflow workflow language and supports both Docker and Singularity containers as well as Conda environments, making it ideal for easy deployment on any POSIX-compatible computers and cloud compute environments. Sarek follows the GATK best-practice recommendations for read alignment and pre-processing, and includes a wide range of software for the identification and annotation of germline and somatic single-nucleotide variants, insertion and deletion variants, structural variants, tumour sample purity, and variations in ploidy and copy number. Sarek offers easy, efficient, and reproducible WGS analyses, and can readily be used both as a production workflow at sequencing facilities and as a powerful stand-alone tool for individual research groups. The Sarek source code, documentation and installation instructions are freely available at https://github.com/nf-core/sarek and at https://nf-co.re/sarek/.


Subjects
Germ Cells, Software, Whole Genome Sequencing/methods, Workflow, Humans
12.
BMC Biol ; 18(1): 78, 2020 06 30.
Article in English | MEDLINE | ID: mdl-32605573

ABSTRACT

BACKGROUND: Sex chromosomes have evolved independently multiple times in eukaryotes and are therefore considered a prime example of convergent genome evolution. Sex chromosomes are known to emerge after recombination is halted between a homologous pair of chromosomes, and this leads to a range of non-adaptive modifications causing gradual degeneration and gene loss on the sex-limited chromosome. However, the proximal causes of recombination suppression and the pace at which degeneration subsequently occurs remain unclear. RESULTS: Here, we use long- and short-read single-molecule sequencing approaches to assemble and annotate a draft genome of the basket willow, Salix viminalis, a species with a female heterogametic system at the earliest stages of sex chromosome emergence. Our single-molecule approach allowed us to phase the emerging Z and W haplotypes in a female, and we detected very low levels of Z/W single-nucleotide divergence in the non-recombining region. Linked-read sequencing of the same female and an additional male (ZZ) revealed the presence of two evolutionary strata supported by both divergence between the Z and W haplotypes and by haplotype phylogenetic trees. Gene order is still largely conserved between the Z and W homologs, although the W-linked region contains genes involved in cytokinin signaling regulation that are not syntenic with the Z homolog. Furthermore, we find no support across multiple lines of evidence for inversions, which have long been assumed to halt recombination between the sex chromosomes. CONCLUSIONS: Our data suggest that selection against recombination is a more gradual process at the earliest stages of sex chromosome formation than would be expected from an inversion and may result instead from the accumulation of transposable elements. Our results present a cohesive understanding of the earliest genomic consequences of recombination suppression as well as valuable insights into the initial stages of sex chromosome formation and regulation of sex differentiation.


Subjects
Plant Chromosomes, Plant Genome, Salix/genetics
13.
Nat Commun ; 11(1): 1842, 2020 04 15.
Article in English | MEDLINE | ID: mdl-32296054

ABSTRACT

Despite considerable progress in schizophrenia genetics, most findings have been for large rare structural variants and common variants in well-imputed regions with few genes implicated from exome sequencing. Whole genome sequencing (WGS) can potentially provide a more complete enumeration of etiological genetic variation apart from the exome and regions of high linkage disequilibrium. We analyze high-coverage WGS data from 1162 Swedish schizophrenia cases and 936 ancestry-matched population controls. Our main objective is to evaluate the contribution to schizophrenia etiology from a variety of genetic variants accessible to WGS but not by previous technologies. Our results suggest that ultra-rare structural variants that affect the boundaries of topologically associated domains (TADs) increase risk for schizophrenia. Alterations in TAD boundaries may lead to dysregulation of gene expression. Future mechanistic studies will be needed to determine the precise functional effects of these variants on biology.


Subjects
Genome-Wide Association Study/methods, Schizophrenia/genetics, Brain/metabolism, Exome/genetics, Human Genome/genetics, High-Throughput Nucleotide Sequencing, Humans, Male, Nervous System/metabolism, Quality Control, DNA Sequence Analysis
14.
Nat Ecol Evol ; 3(12): 1725-1730, 2019 12.
Article in English | MEDLINE | ID: mdl-31740847

ABSTRACT

Genes with sex-biased expression show a number of unique properties and this has been seen as evidence for conflicting selection pressures in males and females, forming a genetic 'tug-of-war' between the sexes. However, we lack studies of taxa where an understanding of conflicting phenotypic selection in the sexes has been linked with studies of genomic signatures of sexual conflict. Here, we provide such a link. We used an insect where sexual conflict is unusually well understood, the seed beetle Callosobruchus maculatus, to test for molecular genetic signals of sexual conflict across genes with varying degrees of sex-bias in expression. We sequenced, assembled and annotated its genome and performed population resequencing of three divergent populations. Sex-biased genes showed increased levels of genetic diversity and bore a remarkably clear footprint of relaxed purifying selection. Yet, segregating genetic variation was also affected by balancing selection in weakly female-biased genes, while male-biased genes showed signs of overall purifying selection. Female-biased genes contributed disproportionally to shared polymorphism across populations, while male-biased genes, male seminal fluid protein genes and sex-linked genes did not. Genes showing genomic signatures consistent with sexual conflict generally matched life-history phenotypes known to experience sexually antagonistic selection in this species. Our results highlight metabolic and reproductive processes, confirming the key role of general life-history traits in sexual conflict.


Subjects
Genetic Selection, Sex Characteristics, Female, Genome, Genomics, Male, Phenotype
15.
Proc Natl Acad Sci U S A ; 115(46): E10970-E10978, 2018 11 13.
Article in English | MEDLINE | ID: mdl-30373829

ABSTRACT

The Populus genus is one of the major plant model systems, but genomic resources have thus far primarily been available for poplar species, and primarily Populus trichocarpa (Torr. & Gray), which was the first tree with a whole-genome assembly. To further advance evolutionary and functional genomic analyses in Populus, we produced genome assemblies and population genetics resources of two aspen species, Populus tremula L. and Populus tremuloides Michx. The two aspen species have distributions spanning the Northern Hemisphere, where they are keystone species supporting a wide variety of dependent communities and produce a diverse array of secondary metabolites. Our analyses show that the two aspens share a similar genome structure and a highly conserved gene content with P. trichocarpa but display substantially higher levels of heterozygosity. Based on population resequencing data, we observed widespread positive and negative selection acting on both coding and noncoding regions. Furthermore, patterns of genetic diversity and molecular evolution in aspen are influenced by a number of features, such as expression level, coexpression network connectivity, and regulatory variation. To maximize the community utility of these resources, we have integrated all presented data within the PopGenIE web resource (PopGenIE.org).


Subjects
Populus/genetics, Biological Evolution, Plant DNA/genetics, Molecular Evolution, Genetic Variation, Population Genetics/methods, Plant Genome, Genomics, Linkage Disequilibrium/genetics, Phylogeny, Genetic Selection/genetics, DNA Sequence Analysis/methods, Trees/genetics
16.
Genome Biol ; 19(1): 72, 2018 06 04.
Article in English | MEDLINE | ID: mdl-29866176

ABSTRACT

BACKGROUND: The initiation of growth cessation and dormancy represent critical life-history trade-offs between survival and growth and have important fitness effects in perennial plants. Such adaptive life-history traits often show strong local adaptation along environmental gradients but, despite their importance, the genetic architecture of these traits remains poorly understood. RESULTS: We integrate whole genome re-sequencing with environmental and phenotypic data from common garden experiments to investigate the genomic basis of local adaptation across a latitudinal gradient in European aspen (Populus tremula). A single genomic region containing the PtFT2 gene mediates local adaptation in the timing of bud set and explains 65% of the observed genetic variation in bud set. This locus is the likely target of a recent selective sweep that originated right before or during colonization of northern Scandinavia following the last glaciation. Field and greenhouse experiments confirm that variation in PtFT2 gene expression affects the phenotypic variation in bud set that we observe in wild natural populations. CONCLUSIONS: Our results reveal a major effect locus that determines the timing of bud set and that has facilitated rapid adaptation to shorter growing seasons and colder climates in European aspen. The discovery of a single locus explaining a substantial fraction of the variation in a key life-history trait is remarkable, given that such traits are generally considered to be highly polygenic. These findings provide a dramatic illustration of how loci of large-effect for adaptive traits can arise and be maintained over large geographical scales in natural populations.


Subjects
Physiological Adaptation/genetics, Genetic Loci/genetics, Genetic Variation/genetics, Plants/genetics, Plant Genes/genetics, Plant Genome/genetics, Life History Traits, Phenotype, Populus/genetics
18.
Eur J Hum Genet ; 25(11): 1253-1260, 2017 11.
Article in English | MEDLINE | ID: mdl-28832569

ABSTRACT

Here we describe the SweGen data set, a comprehensive map of genetic variation in the Swedish population. These data represent a basic resource for clinical genetics laboratories as well as for sequencing-based association studies by providing information on genetic variant frequencies in a cohort that is well matched to national patient cohorts. To select samples for this study, we first examined the genetic structure of the Swedish population using high-density SNP-array data from a nation-wide cohort of over 10 000 Swedish-born individuals included in the Swedish Twin Registry. A total of 1000 individuals, reflecting a cross-section of the population and capturing the main genetic structure, were selected for whole-genome sequencing. Analysis pipelines were developed for automated alignment, variant calling and quality control of the sequencing data. This resulted in a genome-wide collection of aggregated variant frequencies in the Swedish population that we have made available to the scientific community through the website https://swefreq.nbis.se. A total of 29.2 million single-nucleotide variants and 3.8 million indels were detected in the 1000 samples, with 9.9 million of these variants not present in current databases. Each sample contributed with an average of 7199 individual-specific variants. In addition, an average of 8645 larger structural variants (SVs) were detected per individual, and we demonstrate that the population frequencies of these SVs can be used for efficient filtering analyses. Finally, our results show that the genetic diversity within Sweden is substantial compared with the diversity among continental European populations, underscoring the relevance of establishing a local reference data set.


Subjects
Human Genome, Single Nucleotide Polymorphism, Registries, Datasets as Topic, Genome-Wide Association Study, Humans, Sweden, Twins/genetics
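
The abstract of record 18 notes that population frequencies can be used for filtering. A minimal sketch of that idea follows, assuming the frequencies have already been loaded into a plain dictionary. The variant identifiers, the frequencies, the 1% threshold, and the function name are all invented for illustration; nothing here reflects the actual SweGen data format or the swefreq website interface.

```python
def filter_rare_variants(candidates, population_freq, max_freq=0.01):
    """Keep candidate variants that are rare or absent in the reference population.

    candidates:      iterable of variant identifiers found in a patient sample.
    population_freq: dict mapping variant identifier -> frequency in the reference cohort.
    max_freq:        frequency cutoff (an arbitrary example value, not a recommendation).
    """
    # Variants missing from the reference table are treated as frequency 0.0.
    return [v for v in candidates if population_freq.get(v, 0.0) <= max_freq]


# Hypothetical reference frequencies and patient candidates.
toy_population_freq = {"varA": 0.25, "varB": 0.004, "varC": 0.01}
toy_candidates = ["varA", "varB", "varC", "varD"]
print(filter_rare_variants(toy_candidates, toy_population_freq))
```

Common variants such as the hypothetical `varA` drop out, which narrows the list of candidates that need manual review.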
19.
Nucleic Acids Res ; 45(5): 2629-2643, 2017 03 17.
Article in English | MEDLINE | ID: mdl-28100699

ABSTRACT

Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is at a quality yet achieved only for a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies.


Subjects
Fungal Proteins/genetics, Fungal Genome, Malassezia/genetics, Molecular Sequence Annotation/methods, Proteogenomics/methods, Fungal Genes, Mitochondrial Genome, Peptides/genetics, Protein Domains, RNA Sequence Analysis
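
The gene-count percentages in the abstract of record 19 can be checked with a few lines of arithmetic. The three counts are taken from the abstract; the helper name `percent_increase` is just for illustration. The check suggests that the 9% gain from manual curation is measured against the 4113 genes obtained with proteomics, not against the original 3612.

```python
def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


transcriptomics_only = 3612   # genes annotated from transcriptomics evidence alone
with_proteomics = 4113        # after integrating proteomics data
after_curation = 4493         # after manual curation

proteomics_gain = percent_increase(transcriptomics_only, with_proteomics)
curation_gain = percent_increase(with_proteomics, after_curation)
print(f"proteomics: {proteomics_gain:.1f}%  curation: {curation_gain:.1f}%")
```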
20.
Nature ; 535(7611): 294-8, 2016 07 14.
Article in English | MEDLINE | ID: mdl-27411634

ABSTRACT

Vascular and haematopoietic cells organize into specialized tissues during early embryogenesis to supply essential nutrients to all organs and thus play critical roles in development and disease. At the top of the haemato-vascular specification cascade lies cloche, a gene that when mutated in zebrafish leads to the striking phenotype of loss of most endothelial and haematopoietic cells and a significant increase in cardiomyocyte numbers. Although this mutant has been analysed extensively to investigate mesoderm diversification and differentiation and continues to be broadly used as a unique avascular model, the isolation of the cloche gene has been challenging due to its telomeric location. Here we used a deletion allele of cloche to identify several new cloche candidate genes within this genomic region, and systematically genome-edited each candidate. Through this comprehensive interrogation, we succeeded in isolating the cloche gene and discovered that it encodes a PAS-domain-containing bHLH transcription factor, and that it is expressed in a highly specific spatiotemporal pattern starting during late gastrulation. Gain-of-function experiments show that it can potently induce endothelial gene expression. Epistasis experiments reveal that it functions upstream of etv2 and tal1, the earliest expressed endothelial and haematopoietic transcription factor genes identified to date. A mammalian cloche orthologue can also rescue blood vessel formation in zebrafish cloche mutants, indicating a highly conserved role in vertebrate vasculogenesis and haematopoiesis. The identification of this master regulator of endothelial and haematopoietic fate enhances our understanding of early mesoderm diversification and may lead to improved protocols for the generation of endothelial and haematopoietic cells in vivo and in vitro.


Subjects
Basic Helix-Loop-Helix Transcription Factors/metabolism, Blood Cells/cytology, Blood Cells/metabolism, Cell Differentiation/genetics, Endothelial Cells/cytology, Endothelial Cells/metabolism, Zebrafish Proteins/metabolism, Animals, Basic Helix-Loop-Helix Transcription Factors/chemistry, Basic Helix-Loop-Helix Transcription Factors/genetics, Blood Vessels/cytology, Blood Vessels/embryology, Blood Vessels/metabolism, Conserved Sequence, Genetic Epistasis, Gene Deletion, Helix-Loop-Helix Motifs, Hematopoiesis, Mesoderm/cytology, Mesoderm/embryology, Mesoderm/metabolism, Mutation, Tertiary Protein Structure, Proto-Oncogene Proteins/genetics, T-Cell Acute Lymphocytic Leukemia Protein 1, Zebrafish/embryology, Zebrafish/genetics, Zebrafish Proteins/chemistry, Zebrafish Proteins/genetics