ABSTRACT
In the late 19th century, formalin fixation with paraffin-embedding (FFPE) of tissues was developed as a fixation and conservation method and is still used to this day in routine clinical and pathological practice. The implementation of state-of-the-art nucleic acid sequencing technologies has sparked much interest for using historical FFPE samples stored in biobanks as they hold promise in extracting new information from these valuable samples. However, formalin fixation chemically modifies DNA, which potentially leads to incorrect sequences or misinterpretations in downstream processing and data analysis. Many publications have concentrated on one type of DNA damage, but few have addressed the complete spectrum of FFPE-DNA damage. Here, we review mitigation strategies in (I) pre-analytical sample quality control, (II) DNA repair treatments, (III) analytical sample preparation and (IV) bioinformatic analysis of FFPE-DNA. We then provide recommendations that are tested and illustrated with DNA from 13-year-old liver specimens, one FFPE preserved and one fresh frozen, applying target-enriched sequencing. Thus, we show how DNA damage can be compensated, even when using low quantities (50 ng) of fragmented FFPE-DNA (DNA integrity number 2.0) that cannot be amplified well (Q129 bp/Q41 bp = 5%). Finally, we provide a checklist called 'ERROR-FFPE-DNA' that summarises recommendations for the minimal information in publications required for assessing fitness-for-purpose and inter-study comparison when using FFPE samples.
Subjects
DNA Sequence Analysis , DNA/genetics , DNA/analysis , Formaldehyde , Paraffin Embedding/methods , DNA Sequence Analysis/methods , Tissue Fixation/methods
ABSTRACT
Very little information is available on the mutational landscape of vulvar squamous cell carcinoma (VSCC), a disease that mainly affects older women. Studies focusing on the mutational patterns of the currently recognized etiopathogenic types of this tumor (human papillomavirus [HPV]-associated [HPV-A], HPV-independent [HPV-I] with TP53 mutation [HPV-I/TP53mut], and HPV-I with wild-type TP53 [HPV-I/TP53wt]) are particularly rare, and there is almost no information on the prognostic implications of these abnormalities. Whole-exome DNA sequencing of 60 VSCC and matched normal tissues from each patient was performed. HPV detection and immunohistochemistry (IHC) for p16, p53, and mismatch repair proteins were also performed. Ten tumors (16.7%) were classified as HPV-A, 37 (61.7%) as HPV-I/TP53mut, and 13 (21.6%) as HPV-I/TP53wt. TP53 was the most frequently mutated gene (66.7%), followed by FAT1 (28.3%), CDKN2A (25.0%), RNF213 (23.3%), NFE2L2 (20%) and PIK3CA (20%). All 60 tumors (100%) were DNA mismatch repair proficient. Seventeen tumors (28.3%) showed CCND1 gain. Bivariate analysis, adjusted for International Federation of Gynecology and Obstetrics stage, revealed that TP53 mutation, CCND1 gain, and the combination of the 2 alterations were strongly associated with impaired recurrence-free survival (hazard ratio, 4.4; P < .001) and disease-specific survival (hazard ratio, 6.1; P = .002). Similar results were obtained when p53 IHC status was used instead of TP53 status and when considering only HPV-I VSCC. However, in the latter category, p53 IHC maintained its prognostic impact only in combination with CCND1 gains. All tumors carried at least one potentially actionable genomic alteration. In conclusion, VSCCs with CCND1 gain represent a prognostically adverse category among HPV-I/TP53mut tumors. All patients with VSCCs are potential candidates for targeted therapy.
Subjects
Squamous Cell Carcinoma , Cyclin D1 , Exome Sequencing , Mutation , Tumor Suppressor Protein p53 , Vulvar Neoplasms , Humans , Female , Tumor Suppressor Protein p53/genetics , Vulvar Neoplasms/genetics , Vulvar Neoplasms/pathology , Vulvar Neoplasms/virology , Aged , Middle Aged , Squamous Cell Carcinoma/genetics , Squamous Cell Carcinoma/pathology , Squamous Cell Carcinoma/virology , Prognosis , Cyclin D1/genetics , Aged 80 and over , Papillomavirus Infections/complications , Papillomavirus Infections/genetics , Papillomavirus Infections/virology , Adult , Tumor Biomarkers/genetics
ABSTRACT
Endometrial cancer (EC) is the second most frequent gynecological cancer worldwide. Although improvements in EC classification have enabled an accurate establishment of disease prognosis, women with a high-risk or recurrent EC face a dramatic situation due to limited further treatment options. Therefore, new strategies that closely mimic the disease are required to maximize drug development success. Patient-derived xenografts (PDXs) are widely recognized as a physiologically relevant preclinical model. Hence, we propose to molecularly and histologically validate EC PDX models. To reveal the molecular landscape of PDXs generated from 13 EC patients, we performed histological characterization and whole-exome sequencing analysis of tumor samples. We assessed the similarity between PDXs and their corresponding patient's tumor and, additionally, to an extended cohort of EC patients obtained from The Cancer Genome Atlas (TCGA). Finally, we performed functional enrichment analysis to reveal differences in molecular pathway activation in PDX models. We demonstrated that the PDX models had a well-defined and differentiated molecular profile that matched the genomic profile described by the TCGA for each EC subtype. Thus, we validated EC PDX's potential to reliably recapitulate the majority of histologic and molecular EC features. This work highlights the importance of a thorough characterization of preclinical models for the improvement of the success rate of drug-screening assays for personalized medicine.
Subjects
Endometrial Neoplasms , Local Neoplasm Recurrence , Animals , Animal Disease Models , Endometrial Neoplasms/pathology , Female , Genomics , Heterografts , Humans , Xenograft Model Antitumor Assays
ABSTRACT
The growing number of sequenced genomes allows us now to address a key question in genetics and evolutionary biology: which genomic changes underlie particular phenotypic changes between species? Previously, we developed a computational framework called Forward Genomics that associates phenotypic to genomic differences by focusing on phenotypes that are independently lost in different lineages. However, our previous implementation had three main limitations. Here, we present two new Forward Genomics methods that overcome these limitations by (1) directly controlling for phylogenetic relatedness, (2) controlling for differences in evolutionary rates, and (3) computing a statistical significance. We demonstrate on large-scale simulated data and on real data that both new methods substantially improve the sensitivity to detect associations between phenotypic and genomic differences. We applied these new methods to detect genomic differences involved in the loss of vision in the blind mole rat and the cape golden mole, two independent subterranean mammals. Forward Genomics identified several genes that are enriched in functions related to eye development and the perception of light, as well as genes involved in the circadian rhythm. These new Forward Genomics methods represent a significant advance in our ability to discover the genomic basis underlying phenotypic differences between species. Source code: https://github.com/hillerlab/ForwardGenomics/.
Subjects
Biological Evolution , Computational Biology/methods , Genomics/methods , Animals , Base Sequence , Computer Simulation , Molecular Evolution , Genetic Speciation , Genotype , Genetic Models , Mutation Rate , Phenotype , Phylogeny
ABSTRACT
Balancing selection is an important evolutionary force that maintains genetic and phenotypic diversity in populations. Most studies in humans have focused on long-standing balancing selection, which persists over long periods of time and is generally shared across populations. But balanced polymorphisms can also promote fast adaptation, especially when the environment changes. To better understand the role of previously balanced alleles in novel adaptations, we analyzed in detail four loci as case examples of this mechanism. These loci show hallmark signatures of long-term balancing selection in African populations, but not in Eurasian populations. The disparity between populations is due to changes in allele frequencies, with intermediate frequency alleles in Africans (likely due to balancing selection) segregating instead at low- or high-derived allele frequency in Eurasia. We explicitly tested the support for different evolutionary models with an approximate Bayesian computation approach and show that the patterns in PKDREJ, SDR39U1, and ZNF473 are best explained by recent changes in selective pressure in certain populations. Specifically, we infer that alleles previously under long-term balancing selection, or alleles linked to them, were recently targeted by positive selection in Eurasian populations. Balancing selection thus likely served as a source of functional alleles that mediated subsequent adaptations to novel environments.
Subjects
Population Genetics/methods , Genetic Selection , 3-Hydroxyacyl CoA Dehydrogenases/genetics , Alleles , Biological Evolution , DNA-Binding Proteins/genetics , Nucleic Acid Databases , Molecular Evolution , Gene Frequency , Gene-Environment Interaction , Genetic Variation , Humans , Cell Surface Receptors/genetics , DNA Sequence Analysis/methods
ABSTRACT
We present the DNA sequence of 17,367 protein-coding genes in two Neandertals from Spain and Croatia and analyze them together with the genome sequence recently determined from a Neandertal from southern Siberia. Comparisons with present-day humans from Africa, Europe, and Asia reveal that genetic diversity among Neandertals was remarkably low, and that they carried a higher proportion of amino acid-changing (nonsynonymous) alleles inferred to alter protein structure or function than present-day humans. Thus, Neandertals across Eurasia had a smaller long-term effective population than present-day humans. We also identify amino acid substitutions in Neandertals and present-day humans that may underlie phenotypic differences between the two groups. We find that genes involved in skeletal morphology have changed more in the lineage leading to Neandertals than in the ancestral lineage common to archaic and modern humans, whereas genes involved in behavior and pigmentation have changed more on the modern human lineage.
Subjects
Exome , Genetic Variation , Neanderthals/genetics , Amino Acid Substitution , Animals , Croatia , DNA/genetics , Gene Frequency , Humans , Paleontology , Phylogeny , Single Nucleotide Polymorphism , Siberia , Spain
ABSTRACT
As humans migrated around the world, they came to inhabit environments that differ widely in the soil levels of certain micronutrients, including selenium (Se). Coupled with cultural variation in dietary practices, these migrations have led to a wide range of Se intake levels in populations around the world. Both excess and deficiency of Se in the diet can have adverse health consequences in humans, with severe Se deficiency resulting in diseases of the bone and heart. Se is required by humans mainly due to its function in selenoproteins, which contain the amino acid selenocysteine as one of their constituent residues. To understand the evolution of the use of this micronutrient in humans, we surveyed the patterns of polymorphism in all selenoprotein genes and genes involved in their regulation in 50 human populations. We find that single nucleotide polymorphisms from populations in Asia, particularly in populations living in the extreme Se-deficient regions of China, have experienced concerted shifts in their allele frequencies. Such differentiation in allele frequencies across genes is not observed in other regions of the world and is not expected under neutral evolution, being better explained by the action of recent positive selection. Thus, recent changes in the use and regulation of Se may harbor the genetic adaptations that helped humans inhabit environments that do not provide adequate levels of Se in the diet.
Subjects
Physiological Adaptation/genetics , Diet , Molecular Evolution , Selenium , Selenoproteins/genetics , China , Gene Frequency , Humans , Molecular Sequence Annotation , Single Nucleotide Polymorphism , Messenger RNA/genetics , Genetic Selection , Selenium/deficiency , Selenocysteine/genetics
ABSTRACT
Balancing selection maintains advantageous genetic and phenotypic diversity in populations. When selection acts for long evolutionary periods selected polymorphisms may survive species splits and segregate in present-day populations of different species. Here, we investigate the role of long-term balancing selection in the evolution of protein-coding sequences in the Homo-Pan clade. We sequenced the exome of 20 humans, 20 chimpanzees, and 20 bonobos and detected eight coding trans-species polymorphisms (trSNPs) that are shared among the three species and have segregated for approximately 14 My of independent evolution. Although the majority of these trSNPs were found in three genes of the major histocompatibility locus cluster, we also uncovered one coding trSNP (rs12088790) in the gene LAD1. All these trSNPs show clustering of sequences by allele rather than by species and also exhibit other signatures of long-term balancing selection, such as segregating at intermediate frequency and lying in a locus with high genetic diversity. Here, we focus on the trSNP in LAD1, a gene that encodes for Ladinin-1, a collagenous anchoring filament protein of basement membrane that is responsible for maintaining cohesion at the dermal-epidermal junction; the gene is also an autoantigen responsible for linear IgA disease. This trSNP results in a missense change (Leucine257Proline) and, besides altering the protein sequence, is associated with changes in gene expression of LAD1.
Subjects
Autoantigens/genetics , Molecular Evolution , Genetic Variation , Non-Fibrillar Collagens/genetics , Genetic Selection , Animals , Exome/genetics , High-Throughput Nucleotide Sequencing , Humans , Pan paniscus , Pan troglodytes , Single Nucleotide Polymorphism , Collagen Type XVII
ABSTRACT
SelenoDB (http://www.selenodb.org) aims to provide high-quality annotations of selenoprotein genes, proteins and SECIS elements. Selenoproteins are proteins that contain the amino acid selenocysteine (Sec) and the first release of the database included annotations for eight species. Since the release of SelenoDB 1.0 many new animal genomes have been sequenced. The annotations of selenoproteins in new genomes usually contain many errors in major databases. For this reason, we have now fully annotated selenoprotein genes in 58 animal genomes. We provide manually curated annotations for human selenoproteins, whereas we use an automatic annotation pipeline to annotate selenoprotein genes in other animal genomes. In addition, we annotate the homologous genes containing cysteine (Cys) instead of Sec. Finally, we have surveyed genetic variation in the annotated genes in humans. We use exon capture and resequencing approaches to identify single-nucleotide polymorphisms in more than 50 human populations around the world. We thus present a detailed view of the genetic divergence of Sec- and Cys-containing genes in animals and their diversity in humans. The addition of these datasets into the second release of the database provides a valuable resource for addressing medical and evolutionary questions in selenium biology.
Subjects
Protein Databases , Genetic Variation , Molecular Sequence Annotation , Selenoproteins/genetics , Animals , Genes , Genome , Humans , Internet , Selenoproteins/classification
ABSTRACT
Establishing the genetic and geographic structure of populations is fundamental, both to understand their evolutionary past and preserve their future. Nevertheless, the patterns of genetic population structure are unknown for most endangered species. This is the case for bonobos (Pan paniscus), which, together with chimpanzees (Pan troglodytes), are humans' closest living relatives. Chimpanzees live across equatorial Africa and are classified into four subspecies,1 with some genetic population substructure even within subspecies. Conversely, bonobos live exclusively in the Democratic Republic of Congo and are considered a homogeneous group with low genetic diversity,2 despite some population structure inferred from mtDNA. Nevertheless, mtDNA aside, their genetic structure remains unknown, hampering our understanding of the species and conservation efforts. Mapping bonobo genetic diversity in space is, however, challenging because, being endangered, only non-invasive sampling is possible for wild individuals. Here, we jointly analyze the exomes and mtDNA from 20 wild-born bonobos, the whole genomes of 10 captive bonobos, and the mtDNA of 136 wild individuals. We identify three genetically distinct bonobo groups of inferred Central, Western, and Far-Western geographic origin within the bonobo range. We estimate the split time between the central and western populations to be ~145,000 years ago and genetic differentiation to be in the order of that of the closest chimpanzee subspecies. Furthermore, our estimated long-term Ne for Far-West (~3,000) is among the lowest estimated for any great ape lineage. Our results highlight the need to attend to the bonobo substructure, both in terms of research and conservation.
ABSTRACT
Two independent exome sequencing initiatives aimed to identify new genes involved in the predisposition to nonpolyposis colorectal cancer led to the identification of heterozygous loss-of-function variants in NPAT, a gene that encodes a cyclin E/CDK2 effector required for S phase entry and a coactivator of histone transcription, in two families with multiple members affected with colorectal cancer. Enrichment of loss-of-function and predicted deleterious NPAT variants was identified in familial/early-onset colorectal cancer patients compared to non-cancer gnomAD individuals, further supporting the association with the disease. Previous studies in Drosophila models showed that NPAT abrogation results in chromosomal instability, increase of double strand breaks, and induction of tumour formation. In line with these results, colorectal cancers with NPAT somatic variants and no DNA repair defects have significantly higher aneuploidy levels than NPAT-wildtype colorectal cancers. In conclusion, our findings suggest that constitutional inactivating NPAT variants predispose to mismatch repair-proficient nonpolyposis colorectal cancer.
Subjects
Germ-Line Mutation , Adult , Aged , Female , Humans , Male , Middle Aged , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Hereditary Nonpolyposis Colorectal Neoplasms/genetics , Loss of Function Mutation , Pedigree
ABSTRACT
The aim of this study was to determine how TERTp mutations impact glioblastoma prognosis. MATERIALS AND METHODS: TERTp mutations were assessed in a retrospective cohort of 258 uniformly treated glioblastoma patients. RNA-sequencing and whole exome sequencing results were available in a subset of patients. RESULTS: Overall, there were no differences in outcomes between patients with TERTp-wt and those with mutated TERTp. However, we found significant differences according to the type of TERTp mutation. Median progression-free survival (mPFS) was 9.1 months for those with the C250T mutation and 7 months for those with either the C228T mutation or TERTp-wt (p = 0.016). Median overall survival (mOS) was 21.9 and 15 months, respectively (p = 0.026). This differential effect was more pronounced in patients with MGMTp methylation (mPFS: p = 0.008; mOS: p = 0.021). Multivariate analysis identified the C250T mutation as an independent prognostic factor for longer mOS (HR 0.69; p = 0.044). We found no differences according to TERTp mutation status in molecular alterations common in glioblastoma, nor in copy number variants in genes related to alternative lengthening of telomeres. Nevertheless, in the gene enrichment analysis adjusted for MGMTp methylation status, some Reactome gene sets were differentially enriched, suggesting that the C250T mutation may exert a lesser effect on telomeres or chromosomes. CONCLUSIONS: In our series, patients exhibiting the C250T mutation had a more favorable prognosis compared to those with either TERTp-wt or TERTp C228T mutations. Additionally, our findings suggest a reduced involvement of the C250T mutation in the underlying biological mechanisms related to telomeres.
ABSTRACT
The exonuclease domain of the catalytic subunit of DNA polymerase epsilon (POLE) removes misincorporated nucleotides, a process called proofreading. POLE-exonuclease mutations cause colorectal and endometrial cancers with an extreme burden of single nucleotide substitutions. We recently reported that the hereditary POLE exonuclease mutation N363K in particular additionally predisposes to aggressive giant cell glioblastomas. We knocked in this mutation homozygously into human cell lines and compared its properties to knock-ins of the likewise hereditary POLE L424V mutation and to a complete proofreading-inactivating mutation (exo-null). We found that N363K cells have higher mutation rates than either L424V or exo-null mutant cells. In contrast to L424V cells, N363K cells exhibit a growth defect, replication stress and DNA damage. In non-transformed cells, these burdens lead to aneuploidy but macroscopically normal nuclei. In contrast, transformed N363K cells phenocopy the enlarged and disorganized nuclei of giant cell glioblastomas. Taken together, our data characterize a POLE exonuclease domain mutant that not only causes single nucleotide hypermutation, but also DNA damage and chromosome instability, leading to an extended tumor spectrum. Our results expand the understanding of the polymerase exonuclease domain and suggest that an assessment of both the mutational potential and the genetic instability might refine classification and treatment of POLE-mutated tumors.
ABSTRACT
Many patients experiencing a rare disease remain undiagnosed even after genomic testing. Reanalysis of existing genomic data has been shown to increase diagnostic yield, although there are few systematic and comprehensive reanalysis efforts that enable collaborative interpretation and future reinterpretation. The Undiagnosed Rare Disease Program of Catalonia project collated previously inconclusive good quality genomic data (panels, exomes, and genomes) and standardized phenotypic profiles from 323 families (543 individuals) with a neurologic rare disease. The data were reanalyzed systematically to identify relatedness, runs of homozygosity, consanguinity, single-nucleotide variants, insertions and deletions, and copy number variants. Data were shared and collaboratively interpreted within the consortium through a customized Genome-Phenome Analysis Platform, which also enables future data reinterpretation. Reanalysis of existing genomic data provided a diagnosis for 20.7% of the patients, including 1.8% diagnosed after the generation of additional genomic data to identify a second pathogenic heterozygous variant. Diagnostic rate was significantly higher for family-based exome/genome reanalysis compared with singleton panels. Most new diagnoses were attributable to recent gene-disease associations (50.8%), additional or improved bioinformatic analysis (19.7%), and standardized phenotyping data integrated within the Undiagnosed Rare Disease Program of Catalonia Genome-Phenome Analysis Platform functionalities (18%).
Subjects
Genomics , Rare Diseases , Computational Biology , Exome , Humans , Rare Diseases/diagnosis , Rare Diseases/genetics , Exome Sequencing
ABSTRACT
The precisionFDA Truth Challenge V2 aimed to assess the state of the art of variant calling in challenging genomic regions. Starting with FASTQs, 20 challenge participants applied their variant-calling pipelines and submitted 64 variant call sets for one or more sequencing technologies (Illumina, PacBio HiFi, and Oxford Nanopore Technologies). Submissions were evaluated following best practices for benchmarking small variants with updated Genome in a Bottle benchmark sets and genome stratifications. Challenge submissions included numerous innovative methods, with graph-based and machine learning methods scoring best for short-read and long-read datasets, respectively. With machine learning approaches, combining multiple sequencing technologies performed particularly well. Recent developments in sequencing and variant calling have enabled benchmarking variants in challenging genomic regions, paving the way for the identification of previously unknown clinically relevant variants.
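As a rough illustration of how a variant call set can be scored against a benchmark set, the sketch below computes precision, recall, and F1 from counts of true positives, false positives, and false negatives. The abstract does not name the scoring metric, so the choice of F1 and the function name `benchmark_metrics` are assumptions made for illustration only.

```python
def benchmark_metrics(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) for one call set.

    tp: calls that match the benchmark set
    fp: calls absent from the benchmark set
    fn: benchmark variants that were not called
    Hypothetical helper; not taken from the challenge's own tooling.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative counts only, not data from the challenge.
    p, r, f = benchmark_metrics(tp=90, fp=10, fn=10)
    print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

In practice such counts would be computed separately per genome stratification, so that performance in challenging regions is not masked by easy ones.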
ABSTRACT
Genome sequencing projects have been initiated for a wide range of eukaryotes. A few projects have reached completion, but most exist as draft assemblies. As one of the main reasons to sequence a genome is to obtain its catalog of genes, an important question is how complete or completable the catalog is in unfinished genomes. To answer this question, we have identified a set of core eukaryotic genes (CEGs), that are extremely highly conserved and which we believe are present in low copy numbers in higher eukaryotes. From an analysis of a phylogenetically diverse set of eukaryotic genome assemblies, we found that the proportion of CEGs mapped in draft genomes provides a useful metric for describing the gene space, and complements the commonly used N50 length and x-fold coverage values.
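The two kinds of metric mentioned above can be sketched in a few lines: N50 summarises assembly contiguity, while the proportion of CEGs mapped summarises gene-space completeness. The function names and the toy inputs are assumptions for illustration; this is not the authors' pipeline, and it takes the list of mapped genes as given rather than performing any mapping.

```python
def n50(lengths: list[int]) -> int:
    """Length L such that contigs of length >= L hold at least half the assembly."""
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def ceg_fraction(mapped_genes: list[str], core_set: list[str]) -> float:
    """Fraction of the core gene set that was found in the assembly."""
    core = set(core_set)
    if not core:
        raise ValueError("empty core gene set")
    return len(core & set(mapped_genes)) / len(core)


if __name__ == "__main__":
    # Toy values, chosen only to show the calculation.
    print(n50([2, 3, 4, 5, 6]))
    print(ceg_fraction(["g1", "g2", "gX"], ["g1", "g2", "g3", "g4"]))
```

The two numbers answer different questions: an assembly can have a modest N50 and still contain most core genes, which is why the abstract presents the CEG proportion as a complement to N50 and coverage rather than a replacement.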
Subjects
Genes , Genomics , Animals , Chromosome Mapping , Humans , Proteins/genetics
ABSTRACT
BACKGROUND: Mechanisms driving the progression of chronic lymphocytic leukemia (CLL) from its early stages are not fully understood. The acquisition of molecular changes at the time of progression has been observed in a small fraction of patients, suggesting that CLL progression is not mainly driven by dynamic clonal evolution. In order to shed light on mechanisms that lead to CLL progression, we investigated longitudinal changes in both the genetic and immunological scenarios. METHODS: We performed genetic and immunological longitudinal analysis using paired primary samples from untreated CLL patients that underwent clinical progression (sampling at diagnosis and progression) and from patients with stable disease (sampling at diagnosis and at long-term asymptomatic follow-up). RESULTS: Molecular analysis showed limited and non-recurrent molecular changes at progression, indicating that clonal evolution is not the main driver of clinical progression. Our analysis of the immune kinetics found an increasingly dysfunctional CD8+ T cell compartment in progressing patients that was not observed in those patients that remained asymptomatic. Specifically, terminally exhausted effector CD8+ T cells (T-betdim/-EomeshiPD1hi) accumulated, while the co-expression of inhibitory receptors (PD1, CD244 and CD160) increased, along with an altered gene expression profile in T cells only in those patients that progressed. In addition, malignant cells from patients at clinical progression showed enhanced capacity to induce exhaustion-related markers in CD8+ T cells ex vivo mainly through a mechanism dependent on soluble factors including IL-10. CONCLUSIONS: Altogether, we demonstrate that the interaction with the immune microenvironment plays a key role in clinical progression in CLL, thereby providing a rationale for the use of early immunotherapeutic intervention.
ABSTRACT
Brain metastases are the most common tumor of the brain with a dismal prognosis. A fraction of patients with brain metastasis benefit from treatment with immune checkpoint inhibitors (ICI) and the degree and phenotype of the immune cell infiltration has been used to predict response to ICI. However, the anatomical location of brain lesions limits access to tumor material to characterize the immune phenotype. Here, we characterize immune cells present in brain lesions and matched cerebrospinal fluid (CSF) using single-cell RNA sequencing combined with T cell receptor genotyping. Tumor immune infiltration and specifically CD8+ T cell infiltration can be discerned through the analysis of the CSF. Consistently, identical T cell receptor clonotypes are detected in brain lesions and CSF, confirming cell exchange between these compartments. The analysis of immune cells of the CSF can provide a non-invasive alternative to predict the response to ICI, as well as identify the T cell receptor clonotypes present in brain metastasis.
Subjects
Brain Neoplasms/immunology , Cerebrospinal Fluid/immunology , Leukocytes , Tumor Microenvironment/immunology , Adenocarcinoma of Lung , Brain/pathology , Brain Neoplasms/genetics , Brain Neoplasms/pathology , CD8-Positive T-Lymphocytes/immunology , Humans , Immune Checkpoint Inhibitors , Lung Neoplasms , Prognosis
ABSTRACT
BACKGROUND: In today's age of genomic discovery, no attempt has been made to comprehensively sequence a gymnosperm genome. The largest genus in the coniferous family Pinaceae is Pinus, whose 110-120 species have extremely large genomes (c. 20-40 Gb, 2N = 24). The size and complexity of these genomes have prompted much speculation as to the feasibility of completing a conifer genome sequence. Conifer genomes are reputed to be highly repetitive, but there is little information available on the nature and identity of repetitive units in gymnosperms. The pines have extensive genetic resources, with approximately 329,000 ESTs from eleven species and genetic maps in eight species, including a dense genetic map of the twelve linkage groups in Pinus taeda. RESULTS: We present here the Sanger sequence and annotation of ten P. taeda BAC clones and Genome Analyzer II whole genome shotgun (WGS) sequences representing 7.5% of the genome. Computational annotation of ten BACs predicts three putative protein-coding genes and at least fifteen likely pseudogenes in nearly one megabase of sequence. We found three conifer-specific LTR retroelements in the BACs, and tentatively identified at least 15 others based on evidence from the distantly related angiosperms. Alignment of WGS sequences to the BACs indicates that 80% of BAC sequences have similar copies (≥ 75% nucleotide identity) elsewhere in the genome, but only 23% have identical copies (99% identity). The three most common repetitive elements in the genome were identified and, when combined, represent less than 5% of the genome. CONCLUSIONS: This study indicates that the majority of repeats in the P. taeda genome are 'novel' and will therefore require additional BAC or genomic sequencing for accurate characterization. The pine genome contains a very large number of diverged and probably defunct repetitive elements. This study also provides new evidence that sequencing a pine genome using a WGS approach is a feasible goal.
Subjects
Plant Genome , Pinus taeda/genetics , Repetitive Nucleic Acid Sequences , Plant DNA/chemistry , Plant Genes , Genetic Variation , Magnoliopsida/genetics , Minisatellite Repeats , Retroelements , DNA Sequence Analysis , Tandem Repeat Sequences , Terminal Repeat Sequences
ABSTRACT
Tetraodon nigroviridis is a freshwater puffer fish with the smallest known vertebrate genome. Here, we report a draft genome sequence with long-range linkage and substantial anchoring to the 21 Tetraodon chromosomes. Genome analysis provides a greatly improved fish gene catalogue, including identifying key genes previously thought to be absent in fish. Comparison with other vertebrates and a urochordate indicates that fish proteins have diverged markedly faster than their mammalian homologues. Comparison with the human genome suggests approximately 900 previously unannotated human genes. Analysis of the Tetraodon and human genomes shows that whole-genome duplication occurred in the teleost fish lineage, subsequent to its divergence from mammals. The analysis also makes it possible to infer the basic structure of the ancestral bony vertebrate genome, which was composed of 12 chromosomes, and to reconstruct much of the evolutionary history of ancient and recent chromosome rearrangements leading to the modern human karyotype.