ABSTRACT
BACKGROUND: To unravel the evolutionary history of a complex group, a comprehensive reconstruction of its phylogenetic relationships is crucial. This requires meticulous taxon sampling and careful consideration of multiple characters to ensure a complete and accurate reconstruction. The phylogenetic position of the genus Orestias has so far been estimated from incomplete or partly unavailable information. As a consequence, it was assigned to the family Cyprinodontidae, relating this Andean fish to geographically distant genera distributed in the Mediterranean, the Middle East, and North and Central America. In this study, using complete genome sequencing, we aim to clarify the phylogenetic position of Orestias within the order Cyprinodontiformes. RESULTS: We sequenced the genomes of three Orestias species from the Andean Altiplano. Our analysis revealed that the small genome size in this genus (~0.7 Gb) was caused by a contraction in transposable element (TE) content, particularly in DNA elements and short interspersed nuclear elements (SINEs). Using predicted gene sequences, we generated a phylogenetic tree of Cyprinodontiformes from 902 orthologs extracted from all 32 available genomes as well as three outgroup species. We complemented this analysis with a phylogenetic reconstruction and time calibration based on 12 molecular markers (eight nuclear and four mitochondrial genes) and a stratified taxon sampling covering 198 species from nearly all families and genera of this order. Overall, our results show that phylogenetic closeness is directly related to geographical proximity. Importantly, we found that Orestias is not part of the family Cyprinodontidae and that it is more closely related to the South American fish fauna, with Fluviphylacidae as its closest sister group. CONCLUSIONS: The evolutionary history of the genus Orestias is linked to the South American ichthyofauna, and it should no longer be considered a member of the family Cyprinodontidae. Instead, we submit that Orestias belongs to the family Orestiidae, as suggested by Freyhof et al. (2017), and that it is the sister group of the family Fluviphylacidae, distributed in the Amazon and Orinoco basins. These two groups likely diverged during the Late Eocene, concomitant with hydrogeological changes in the South American landscape.
Subjects
Cyprinodontiformes, Molecular Evolution, Genome, Phylogeny, Animals, Cyprinodontiformes/genetics, Cyprinodontiformes/classification, DNA Transposable Elements/genetics, Genome Size
ABSTRACT
Orestias ascotanensis (Cyprinodontidae) is a teleost pupfish endemic to springs feeding into the Ascotan saltpan in the Chilean Altiplano (3,700 m.a.s.l.) and represents an opportunity to study adaptations to high-altitude aquatic environments. We have de novo assembled the genome of O. ascotanensis at high coverage. Comparative analysis of the O. ascotanensis genome showed an overall process of contraction, including loss of genes related to G-protein signaling, chemotaxis and signal transduction, while there was expansion of gene families associated with microtubule-based movement and protein ubiquitination. We identified 818 genes under positive selection, many of which are involved in DNA repair. Additionally, we identified novel and conserved microRNAs expressed in O. ascotanensis and its closely related species, Orestias gloriae. Our analysis suggests that positive selection and expansion of genes that preserve genome stability are a potential adaptive mechanism to cope with the increased solar UV radiation to which high-altitude animals are exposed.
Subjects
Fundulidae, Killifishes, Physiological Adaptation/genetics, Altitude, Animals, Fundulidae/genetics, Killifishes/genetics, Phylogeny, Transcriptome
ABSTRACT
The whole-genome duplication 80 million years ago of the common ancestor of salmonids (salmonid-specific fourth vertebrate whole-genome duplication, Ss4R) provides unique opportunities to learn about the evolutionary fate of a duplicated vertebrate genome in 70 extant lineages. Here we present a high-quality genome assembly for Atlantic salmon (Salmo salar), and show that large genomic reorganizations, coinciding with bursts of transposon-mediated repeat expansions, were crucial for the post-Ss4R rediploidization process. Comparisons of duplicate gene expression patterns across a wide range of tissues with orthologous genes from a pre-Ss4R outgroup unexpectedly demonstrate far more instances of neofunctionalization than subfunctionalization. Surprisingly, we find that genes that were retained as duplicates after the teleost-specific whole-genome duplication 320 million years ago were not more likely to be retained after the Ss4R, and that the duplicate retention was not influenced to a great extent by the nature of the predicted protein interactions of the gene products. Finally, we demonstrate that the Atlantic salmon assembly can serve as a reference sequence for the study of other salmonids for a range of purposes.
Subjects
Diploidy, Molecular Evolution, Gene Duplication/genetics, Duplicate Genes/genetics, Genome/genetics, Salmo salar/genetics, Animals, DNA Transposable Elements/genetics, Female, Genomics, Male, Genetic Models, Mutagenesis/genetics, Phylogeny, Reference Standards, Salmo salar/classification, Sequence Homology
ABSTRACT
BACKGROUND: The rice weevil Sitophilus oryzae is one of the most important agricultural pests, causing extensive damage to cereals in fields and to stored grains. S. oryzae has an intracellular symbiotic relationship (endosymbiosis) with the Gram-negative bacterium Sodalis pierantonius and is a valuable model to decipher host-symbiont molecular interactions. RESULTS: We sequenced the Sitophilus oryzae genome using a combination of short and long reads to produce the best assembly for a Curculionidae species to date. We show that S. oryzae has undergone successive bursts of transposable element (TE) amplification, with TEs representing 72% of the genome. In addition, we show that many TE families are transcriptionally active, and changes in their expression are associated with insect endosymbiotic state. S. oryzae shows a high rate of gene expansion compared to other beetles. Reconstruction of host-symbiont metabolic networks revealed that, despite its recent association with cereal weevils (30,000 years), S. pierantonius relies on the host for several amino acids and nucleotides to survive and to produce vitamins and essential amino acids required for insect development and cuticle biosynthesis. CONCLUSIONS: Here we present the genome of an agricultural pest beetle, which may act as a foundation for pest control. In addition, S. oryzae may be a useful model for studying endosymbiosis and TE evolution and regulation, along with the impact of TEs on eukaryotic genomes.
Subjects
Coleoptera, Weevils, Animals, Cell Communication, DNA Transposable Elements/genetics, Edible Grain, Humans, Weevils/genetics
ABSTRACT
BACKGROUND: Berry size is considered one of the main selection criteria in table grape breeding programs, owing to consumer preferences. However, berry size is a complex quantitative trait under polygenic control, and its genetic determination is not yet fully understood. The aim of this work was to perform marker discovery using a transcriptomic approach, in order to identify and characterize SNP and InDel markers associated with berry size in table grapes. We used an integrative analysis based on RNA-Seq, SNP/InDel search and validation on table grape segregants and varieties with different genetic backgrounds. RESULTS: Thirty SNPs and eight InDels were identified using a transcriptomic approach (RNA-Seq). These markers were selected from SNPs/InDels found among segregants from a Ruby x Sultanina population with contrasting phenotypes for berry size. The set of 38 SNP and InDel markers was distributed across eight chromosomes. Genotype-phenotype association analyses were performed using a set of 13 RxS segregants and 41 table grape varieties with different genetic backgrounds during three seasons. The results showed several degrees of association of these markers with berry size (10.2 to 30.7%) as well as with other berry-related traits such as length and width. The co-localization of SNP and/or InDel markers with previously reported QTLs and candidate genes associated with berry size was analysed. CONCLUSIONS: We identified a set of informative and transferable SNP and InDel markers associated with berry size. Our results suggest the suitability of SNPs and InDels as candidate markers for berry weight in seedless table grape breeding. The identification of genomic regions associated with berry weight in chromosomes 8, 15 and 17 was achieved with supporting evidence derived from a transcriptome experiment focused on SNP/InDel search, as well as from a QTL-linkage mapping approach. New regions possibly associated with berry weight in chromosomes 3, 6, 9 and 14 were identified.
Subjects
Fruit/genetics, INDEL Mutation, Single Nucleotide Polymorphism, Vitis/genetics, Fruit/growth & development, Gene Expression Profiling, Genetic Markers, Genotype, Phenotype, Quantitative Trait Loci, Plant RNA, RNA-Seq, Vitis/growth & development
ABSTRACT
BACKGROUND: Current South American populations trace their origins mainly to three continental ancestries, i.e. European, Amerindian and African. Individual variation in the relative proportions of each of these ancestries may be confounded with socio-economic factors due to population stratification. Therefore, ancestry is a potential confounding variable that should be considered in epidemiologic studies and in public health plans. However, few studies have assessed the ancestry of the current admixed Chilean population. This is partly due to the high cost of the genome-scale technologies commonly used to estimate ancestry. In this study we have designed a small panel of SNPs to accurately assess ancestry in the largest sampling to date of the Chilean mestizo population (n = 3349) from eight cities. Our panel is also able to distinguish between the two main Amerindian components of Chileans: Aymara from the north and Mapuche from the south. RESULTS: A panel of 150 ancestry-informative markers (AIMs) of SNP type was selected to maximize ancestry informativeness and genome coverage. Of these, 147 were successfully genotyped by KASPar assays in 2843 samples, with an average missing rate of 0.012 and a 0.95 concordance with microarray data. The ancestries estimated with the panel of AIMs had relatively high correlations (0.88 for European, 0.91 for Amerindian, 0.70 for Aymara, and 0.68 for Mapuche components) with those obtained with the AXIOM LAT1 array. The country's average ancestry was 0.53 ± 0.14 European, 0.04 ± 0.04 African, and 0.42 ± 0.14 Amerindian, disaggregated into 0.18 ± 0.15 Aymara and 0.25 ± 0.13 Mapuche. However, Mapuche ancestry was highest in the south (40.03%) and Aymara in the north (35.61%), as expected from the historical location of these ethnic groups. We make our results available through an online app and demonstrate how it can be used to adjust for ancestry when testing association between incidence of a disease and nongenetic risk factors. CONCLUSIONS: We have conducted the most extensive sampling, across many different cities, of the current Chilean population. Ancestry varied significantly by latitude and human development. The panel of AIMs is available to the community for estimating ancestry at low cost in Chileans and other populations with similar ancestry.
Subjects
Ethnicity/genetics, Population Genetics/organization & administration, South American Indians/genetics, Single Nucleotide Polymorphism/genetics, Population Groups/genetics, Chile, Female, Gene Frequency/genetics, Genetic Markers/genetics, Genotype, Genotyping Techniques, Humans, Male, Phylogeography, Saliva
ABSTRACT
A broad portfolio of phenotypic diversity in natural organisms can buffer against exploitation and increase species persistence in disturbed ecosystems. The study of genomic variation that accounts for ecological and evolutionary adaptation can represent a powerful approach to extend understanding of phenotypic variation in nature. Here we present a chromosome-level reference genome assembly for Chinook salmon (Oncorhynchus tshawytscha; 2.36 Gb) that enabled association mapping of life-history variation and phenotypic traits for this species. Whole-genome re-sequencing of populations with distinct life-history traits provided evidence that divergent selection was extensive throughout the genome within and among phylogenetic lineages, indicating that a broad portfolio of phenotypic diversity exists in this species that is related to local adaptation and life-history variation. Association mapping with millions of genome-wide SNPs revealed that a genomic region of major effect on chromosome 28 was associated with phenotypes for premature and mature arrival to spawning grounds and was consistent across three distinct phylogenetic lineages. Our results demonstrate how genomic resources can enlighten the genetic basis of known phenotypes in exploited species and assist in clarifying phenotypic variation that may be difficult to observe in naturally occurring organisms.
Subjects
Chromosome Mapping, Genome, Life History Traits, Reproduction/genetics, Salmon/genetics, Transcriptome, Animals, Female, Genetic Variation, Male, Single Nucleotide Polymorphism
ABSTRACT
Potato (Solanum tuberosum L.) is the world's most important non-grain food crop and is central to global food security. It is clonally propagated, highly heterozygous, autotetraploid, and suffers acute inbreeding depression. Here we use a homozygous doubled-monoploid potato clone to sequence and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade. We also sequenced a heterozygous diploid clone and show that gene presence/absence variants and other potentially deleterious mutations occur frequently and are a likely cause of inbreeding depression. Gene family expansion, tissue-specific expression and recruitment of genes to new pathways contributed to the evolution of tuber development. The potato genome sequence provides a platform for genetic improvement of this vital crop.
Subjects
Plant Genome/genetics, Genomics, Solanum tuberosum/genetics, Molecular Evolution, Gene Duplication, Plant Gene Expression Regulation, Plant Genes/genetics, Genetic Variation, Haplotypes/genetics, Heterozygote, Homozygote, Innate Immunity, Inbreeding, Molecular Sequence Annotation, Molecular Sequence Data, Plant Diseases/genetics, Ploidies, Solanum tuberosum/physiology
ABSTRACT
The BioMart Community Portal (www.biomart.org) is a community-driven effort to provide a unified interface to biomedical databases that are distributed worldwide. The portal provides access to numerous database projects supported by 30 scientific organizations. It includes over 800 different biological datasets spanning genomics, proteomics, model organisms, cancer data, ontology information and more. All resources available through the portal are independently administered and funded by their host organizations. The BioMart data federation technology provides a unified interface to all the available data. The latest version of the portal comes with many new databases that have been created by our ever-growing community. It also comes with better support and extensibility for data analysis and visualization tools. A new addition to our toolbox, the enrichment analysis tool, is now accessible through graphical and web service interfaces. The BioMart Community Portal averages over one million requests per day. Building on this level of service and the wealth of information that has become available, the BioMart Community Portal has introduced a new, more scalable and cheaper alternative to the large data stores maintained by specialized organizations.
Subjects
Database Management Systems, Genomics, Humans, Internet, Neoplasms/genetics, Proteomics
ABSTRACT
BACKGROUND: Berry size is considered one of the main selection criteria in table grape breeding programs. However, this is a quantitative and polygenic trait, and its genetic determination is still poorly understood. Considering its economic importance, it is relevant to determine its genetic architecture and elucidate the mechanisms involved in its expression. To approach this issue, an RNA-Seq experiment on the Illumina platform was performed (14 libraries), including seedless segregants with contrasting phenotypes for berry weight at the fruit setting (FST) and 6-8 mm berries (B68) phenological stages. RESULTS: A group of 526 differentially expressed (DE) genes was identified by comparing seedless segregants with contrasting phenotypes for berry weight: 101 genes from the FST stage and 463 from the B68 stage. We also integrated differential expression, principal component analysis (PCA), correlation and co-expression network analyses to characterize the transcriptome profiles observed in segregants with contrasting phenotypes for berry weight. After this, 68 DE genes were selected as candidate genes, and seven candidate genes were validated by real-time PCR, confirming their expression profiles. CONCLUSIONS: We have carried out the first transcriptome analysis focused on table grape seedless segregants with contrasting phenotypes for berry weight. Our findings contribute to the understanding of the mechanisms involved in berry weight determination. This comparative transcriptome profiling also revealed candidate genes for berry weight which could be evaluated as selection tools in table grape breeding programs.
Subjects
Fruit/genetics, Gene Expression Profiling/methods, Developmental Gene Expression Regulation, Plant Gene Expression Regulation, Vitis/genetics, Cluster Analysis, Fruit/growth & development, Fruit/physiology, Gene Ontology, Plant Genes/genetics, Genotype, Phenotype, Plant Breeding/methods, Principal Component Analysis, Reverse Transcriptase Polymerase Chain Reaction, Seeds/genetics, Seeds/growth & development, Seeds/physiology, RNA Sequence Analysis/methods, Vitis/growth & development, Vitis/physiology
ABSTRACT
BACKGROUND: Piscirickettsia salmonis is the causal agent of Salmon Rickettsial Syndrome (SRS), which affects salmon species and causes severe economic losses. Selective breeding for disease resistance represents one approach for controlling SRS in farmed Atlantic salmon. Knowledge concerning the architecture of the resistance trait is needed before deciding on the most appropriate approach to enhance artificial selection for P. salmonis resistance in Atlantic salmon. The purpose of the study was to dissect the genetic variation in the resistance to this pathogen in Atlantic salmon. METHODS: 2,601 Atlantic salmon smolts were experimentally challenged with P. salmonis by means of intra-peritoneal injection. These smolts were the progeny of 40 sires and 118 dams from a Chilean breeding population. Mortalities were recorded daily and the experiment ended at day 40 post-inoculation. Fish were genotyped using a 50K Affymetrix® Axiom® myDesign™ Single Nucleotide Polymorphism (SNP) Genotyping Array. A Genome Wide Association Analysis was performed on data from the challenged fish. Linear regression and logistic regression models were tested. RESULTS: Genome Wide Association Analysis indicated that resistance to P. salmonis is a moderately polygenic trait. There were five SNPs in chromosomes Ssa01 and Ssa17 significantly associated with the traits analysed. The proportion of the phenotypic variance explained by each marker is small, ranging from 0.007 to 0.045. Candidate genes including interleukin receptors and fucosyltransferase were found to be physically linked with these genetic markers and may play an important role in the differential immune response against this pathogen. CONCLUSIONS: Due to the small amount of variance explained by each significant marker, we conclude that genetic resistance to this pathogen can be more efficiently improved with the implementation of genetic evaluations incorporating genotype information from a dense SNP array.
Subjects
Chromosomes, Disease Resistance/genetics, Fish Diseases/genetics, Fish Diseases/microbiology, Genome-Wide Association Study, Piscirickettsia, Quantitative Trait Loci, Salmo salar/genetics, Salmo salar/microbiology, Alleles, Animals, Fish Diseases/mortality, Gene Frequency, Genetic Association Studies, Linkage Disequilibrium, Phenotype, Single Nucleotide Polymorphism, Heritable Quantitative Trait
ABSTRACT
BACKGROUND: Grapevine (Vitis vinifera L.) is the most important Mediterranean fruit crop, used to produce both wine and spirits as well as table grapes and raisins. Wine and table grape cultivars represent two divergent germplasm pools with different origins and domestication history, as well as differential characteristics for berry size, cluster architecture and berry chemical profile, among others. 'Sultanina' plays a pivotal role in modern table grape breeding, providing the main source of seedlessness. This cultivar is also one of the most planted for fresh consumption and raisin production. Given its importance, we sequenced it and implemented a novel strategy for the de novo assembly of its highly heterozygous genome. RESULTS: Our approach produced a draft genome of 466 Mb, recovering 82% of the genes present in the grapevine reference genome; in addition, we identified 240 novel genes. A large number of structural variants and SNPs were identified. Among them, 45 (21 SNPs and 24 INDELs) were experimentally confirmed in 'Sultanina', and six SNPs were confirmed in 23 other table grape varieties. Transposable elements corresponded to ca. 80% of the repetitive sequences involved in structural variants, and more than 2,000 genes were affected in their structure by these variants. Some of these genes are likely involved in embryo development, suggesting that they may contribute to seedlessness, a key trait for table grapes. CONCLUSIONS: This work produced the first catalog of structural variants and SNPs for grapevine, constituting a novel and very powerful tool for genomic studies in this key fruit crop, particularly useful to support marker-assisted breeding in table grapes.
Subjects
Plant Genome/genetics, Vitis/genetics, Wine, Single Nucleotide Polymorphism/genetics
ABSTRACT
Cyclin-dependent kinase 5 (Cdk5) is a proline-directed serine/threonine kinase predominantly active in the nervous system, where it regulates several processes such as neuronal migration, cytoskeletal dynamics, axonal guidance, and neurotransmission. We constructed a position-specific scoring matrix (PSSM) based on a dataset of sites shown to be phosphorylated both in vivo and in vitro by Cdk5. This dataset was curated manually through an exhaustive search of published experimental data. We then used this PSSM to search the mouse proteome through Scansite, a web-based tool for matching sequence patterns in large databases. Using a stringent cut-off score of 0.5, we identified 354 new putative sites present in 291 proteins. To assess the robustness of our results, ten random subsets (of 80 sites each) of the original dataset were used to construct new PSSMs, which were then used as input for a new Scansite search, leading to the recovery of 81% of the 354 sites by at least 5 PSSMs. To reduce the number of false positives in our sequence-based approach, we evaluated which of these predicted sites were phosphorylated in vivo, as determined by multiple mass spectrometry-based phosphoproteomics studies available in the PhosphoSitePlus database. This step resulted in a very promising list of 132 putative phosphorylation sites for Cdk5, of which 51 are specifically phosphorylated in brain tissue, and some are involved in functions regulated by Cdk5 such as axonal growth, synaptic plasticity and neurotransmission. Other phosphorylation sites in our list suggest that Cdk5 might regulate processes through mechanisms not previously recognized, such as the control of mRNA splicing.
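The idea of building a PSSM from known sites and scanning protein sequences with it can be illustrated with a minimal Python sketch. It uses a generic log-odds matrix with a uniform background and pseudocounts, which is an assumption for illustration and not the scoring scheme used by Scansite. The peptides, function names and the pseudocount are hypothetical and do not come from the curated Cdk5 dataset.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(sites, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length peptides.

    Assumes a uniform background frequency (1/20), for illustration only.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(sites) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(column)
    return pssm


def score(pssm, peptide):
    """Sum the per-position log-odds of a peptide as wide as the PSSM."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


def best_window(pssm, protein):
    """Return (score, start index) of the best-scoring window in a protein."""
    width = len(pssm)
    return max(
        (score(pssm, protein[i:i + width]), i)
        for i in range(len(protein) - width + 1)
    )


# Toy peptides loosely resembling a proline-directed S/T-P motif (invented).
toy_sites = ["ASPAK", "GTPLK", "LSPRR", "ESPGK"]
toy_pssm = build_pssm(toy_sites)
print(best_window(toy_pssm, "MMMMASPAKMMMM"))
```

Repeating `build_pssm` on random subsets of the training sites and checking which hits recur would mirror the robustness check described in the abstract, again only as a sketch.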
Subjects
Cyclin-Dependent Kinase 5/metabolism, Position-Specific Scoring Matrices, Proteome/metabolism, Animals, Brain/metabolism, Computational Biology, Mice, Phosphorylation
ABSTRACT
PURPOSE: The clinical course of pulmonary carcinoids ranges from indolent to fatal disease, suggesting that specific molecular alterations drive progression toward the fully malignant state. A similar spectrum of clinical phenotypes occurs in pediatric neuroblastoma, in which activation of telomerase reverse transcriptase (TERT) is decisive in determining the course of disease. We therefore investigated whether TERT expression defines the clinical fate of patients with pulmonary carcinoid. METHODS: TERT expression was examined by RNA sequencing in a test cohort and a validation cohort of pulmonary carcinoids (n = 88 and n = 105, respectively). A natural TERT expression cutoff was determined in the test cohort on the basis of the distribution of TERT expression, and its prognostic value was assessed by Kaplan-Meier survival estimates and multivariable analyses. Telomerase activity was validated by telomere repeat amplification protocol assay. RESULTS: Similar to neuroblastoma, TERT expression exhibited a bimodal distribution in pulmonary carcinoids, separating tumors into TERT-high and TERT-low subgroups. A natural TERT cutoff discriminated unfavorable from favorable clinical courses with high accuracy both in the test cohort (5-year overall survival [OS], 0.547 ± 0.132 v 1.0; P < .001) and the validation cohort (5-year OS, 0.788 ± 0.063 v 0.913 ± 0.048; P < .001). In line with these findings, telomerase activity was largely absent in TERT-low tumors, whereas it was readily detectable in TERT-high carcinoids. In multivariable analysis considering TERT expression, histology (typical v atypical carcinoid), and stage (≤IIA v ≥IIB), high TERT expression was an independent prognostic marker for poor survival, with a hazard ratio of 5.243 (95% CI, 1.943 to 14.148; P = .001). CONCLUSION: Our data demonstrate that high TERT expression defines clinically aggressive pulmonary carcinoids with fatal outcome, similar to neuroblastoma, indicating that activation of TERT may be a defining feature of lethal cancers.
ABSTRACT
BACKGROUND: Data normalization is a key step in gene expression analysis by qPCR. Endogenous control genes are used to estimate variations and experimental errors occurring during sample preparation and expression measurements. However, the transcription level of the most commonly used reference genes can vary considerably in samples obtained from different individuals, tissues and developmental stages and under variable physiological conditions, resulting in a misinterpretation of the performance of the target gene(s). This issue has rarely been addressed in woody species such as grapevine. RESULTS: A statistical criterion was applied to select a subset of 19 candidate reference genes from a total of 242 non-differentially expressed (NDE) genes derived from an RNA-Seq experiment comprising ca. 500 million reads obtained from 14 table-grape genotypes sampled at four phenological stages. Of the 19 candidate reference genes, VvAIG1 (AvrRpt2-induced gene) and VvTCPB (T-complex 1 beta-like protein) were found to be the most stable after comparing the complete set of genotypes and phenological stages studied. This result was further validated by qPCR and geNorm analyses. CONCLUSIONS: Based on the evidence presented in this work, we propose using the grapevine genes VvAIG1, VvTCPB or both as a reference tool to normalize RNA expression in qPCR assays or other quantitative methods intended to measure gene expression in berries and other tissues of this fruit crop, sampled at different developmental stages and physiological conditions.
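As an illustration of how two reference genes might be used together for normalization, the following Python sketch divides the target quantity by the geometric mean of the reference-gene quantities, in the spirit of geNorm-style normalization factors. It assumes an amplification efficiency of 2 (perfect doubling per cycle) for every gene; the Ct values and function names are hypothetical and are not taken from the study.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def relative_expression(ct_target, ct_references, efficiency=2.0):
    """Target quantity relative to the geometric mean of reference genes.

    Quantities are modelled as efficiency ** (-Ct); efficiency=2.0 is an
    assumption and should be replaced by measured primer efficiencies.
    """
    target_quantity = efficiency ** (-ct_target)
    reference_quantities = [efficiency ** (-ct) for ct in ct_references]
    return target_quantity / geometric_mean(reference_quantities)


# Hypothetical Ct values: one target gene and two reference genes
# (standing in for VvAIG1 and VvTCPB) measured in the same sample.
print(relative_expression(ct_target=24.0, ct_references=[20.0, 22.0]))
```

With a single reference gene the function reduces to the familiar efficiency ** (-ΔCt) calculation, so the same code covers the "either gene or both" options proposed in the abstract.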
Subjects
Gene Expression Profiling, Plant Gene Expression Regulation, Plant Genes, Vitis/genetics, Fruit/genetics, Organ Specificity/genetics, RNA Sequence Analysis
ABSTRACT
Malignant pleural mesothelioma (MPM) is an aggressive cancer with rising incidence and challenging clinical management. Through a large series of whole-genome sequencing data, integrated with transcriptomic and epigenomic data using multiomics factor analysis, we demonstrate that the current World Health Organization classification only accounts for up to 10% of interpatient molecular differences. Instead, the MESOMICS project paves the way for a morphomolecular classification of MPM based on four dimensions: ploidy, tumor cell morphology, adaptive immune response and CpG island methylator profile. We show that these four dimensions are complementary, capture major interpatient molecular differences and are delimited by extreme phenotypes that, in the case of the interdependent tumor cell morphology and adaptive immune response, reflect tumor specialization. These findings unearth the interplay between MPM functional biology and its genomic history, and provide insights into the variations observed in the clinical behavior of patients with MPM.
Subjects
Lung Neoplasms, Malignant Mesothelioma, Mesothelioma, Pleural Neoplasms, Humans, Malignant Mesothelioma/genetics, Malignant Mesothelioma/complications, Mesothelioma/genetics, Mesothelioma/pathology, Multiomics, Pleural Neoplasms/genetics, Pleural Neoplasms/pathology, Lung Neoplasms/pathology, Tumor Biomarkers/genetics
ABSTRACT
Sulfobacillus thermosulfidooxidans strain Cutipay is a mixotrophic, acidophilic, moderately thermophilic bacterium isolated from mining environments of the north of Chile, making it an interesting subject for studying the bioleaching of copper. We introduce the draft genome sequence and annotation of this strain, which provide insights into its mechanisms for heavy metal resistance.
Subjects
Bacteria/genetics, Bacterial Genome, Bacteria/classification, Chile, Mining, Molecular Sequence Data, Soil Microbiology
ABSTRACT
Acidithiobacillus ferrooxidans is a chemolithoautotrophic acidophilic bacterium that obtains its energy from the oxidation of ferrous iron, elemental sulfur, or reduced sulfur minerals. This capability makes it of great industrial importance due to its applications in biomining. During industrial processes, A. ferrooxidans survives stressful conditions in its environment, such as an extremely acidic pH and high concentrations of transition metals. In order to gain insight into the organization of A. ferrooxidans regulatory networks and to provide a framework for further studies of bacterial growth under extreme conditions, we applied a genome-wide annotation procedure to identify 87 A. ferrooxidans transcription factors. We classified them into 19 families that were conserved among diverse prokaryotic phyla. Our annotation procedure revealed that the A. ferrooxidans genome contains several members of the ArsR and MerR families, which are involved in metal resistance and detoxification. Analysis of their sequences revealed known and potentially new mechanisms to coordinate gene expression in response to metal availability. A. ferrooxidans inhabits some of the most metal-rich environments known; thus, the transcription factors identified here seem to be good candidates for functional studies to determine their physiological roles and to place them into A. ferrooxidans transcriptional regulatory networks.
Subjects
Acidithiobacillus/genetics, Acidithiobacillus/metabolism, Bacterial Proteins/metabolism, DNA-Binding Proteins/metabolism, Bacterial Gene Expression Regulation, Metals/metabolism, Transcription Factors/metabolism, Amino Acid Sequence, Bacterial Proteins/classification, Bacterial Proteins/genetics, Binding Sites, DNA-Binding Proteins/classification, DNA-Binding Proteins/genetics, Homeostasis, Molecular Sequence Data, Phylogeny, Sequence Alignment, Transcription Factors/classification, Transcription Factors/genetics
ABSTRACT
BACKGROUND: Malignant pleural mesothelioma (MPM) is a rare, understudied cancer associated with exposure to asbestos. So far, MPM patients have benefited only marginally from the genomic medicine revolution due to the limited size or breadth of existing molecular studies. In the context of the MESOMICS project, we have performed the most comprehensive molecular characterization of MPM to date, with the underlying dataset made of the largest whole-genome sequencing series yet reported, together with transcriptome sequencing and methylation arrays for 120 MPM patients. RESULTS: We first provide comprehensive quality controls for all samples, of both raw and processed data. Due to the difficulty in collecting specimens from such rare tumors, a part of the cohort does not include matched normal material. We provide a detailed analysis of data processing of these tumor-only samples, showing that all somatic alteration calls match very stringent criteria of precision and recall. Finally, integrating our data with previously published multiomic MPM datasets (n = 374 in total), we provide an extensive molecular phenotype map of MPM based on the multitask theory. The generated map can be interactively explored and interrogated on the UCSC TumorMap portal (https://tumormap.ucsc.edu/?p=RCG_MESOMICS/MPM_Archetypes). CONCLUSIONS: This new high-quality MPM multiomics dataset, together with the state-of-the-art bioinformatics and interactive visualization tools we provide, will support the development of precision medicine in MPM, which is particularly challenging to implement in rare cancers due to limited molecular studies.
Subjects
Lung Neoplasms, Malignant Mesothelioma, Mesothelioma, Pleural Neoplasms, Humans, Mesothelioma/genetics, Mesothelioma/pathology, Lung Neoplasms/genetics, Lung Neoplasms/pathology, Pleural Neoplasms/genetics, Pleural Neoplasms/pathology, Phenotype
ABSTRACT
Generating accurate genome assemblies of large, repeat-rich human genomes has proved difficult using only long, error-prone reads, and most human genomes assembled from long reads add accurate short reads to polish the consensus sequence. Here we report an algorithm for hybrid assembly, WENGAN, that provides very high quality at low computational cost. We demonstrate de novo assembly of four human genomes using a combination of sequencing data generated on ONT PromethION, PacBio Sequel, Illumina and MGI technology. WENGAN implements efficient algorithms to improve assembly contiguity as well as consensus quality. The resulting genome assemblies have high contiguity (contig NG50: 17.24-80.64 Mb), few assembly errors (contig NGA50: 11.8-59.59 Mb), good consensus quality (QV: 27.84-42.88) and high gene completeness (BUSCO complete: 94.6-95.2%), while consuming low computational resources (CPU hours: 187-1,200). In particular, the WENGAN assembly of the haploid CHM13 sample achieved a contig NG50 of 80.64 Mb (NGA50: 59.59 Mb), which surpasses the contiguity of the current human reference genome (GRCh38 contig NG50: 57.88 Mb).
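For readers unfamiliar with the contiguity metric quoted above, NG50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the estimated genome size. The Python sketch below shows this calculation on invented contig lengths; the function name and the toy numbers are illustrative assumptions and are unrelated to the WENGAN implementation.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly, or None if the contigs
    cover less than half of the estimated genome size."""
    half_genome = genome_size / 2
    covered = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if covered >= half_genome:
            return length
    return None


# Toy assembly (lengths in Mb) against a hypothetical 300 Mb genome.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(ng50(toy_contigs, genome_size=300))

# Passing the total assembly length as genome_size gives the ordinary N50.
print(ng50(toy_contigs, genome_size=sum(toy_contigs)))
```

NGA50 follows the same logic but is computed on aligned blocks after contigs are broken at misassemblies, which requires a reference alignment and is not covered by this sketch.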