ABSTRACT
The 2015 American College of Medical Genetics and Genomics and the Association for Molecular Pathology variant classification publication established a standard employed internationally to guide laboratories in variant assessment. Those recommendations included both pathogenic (PP1) and benign (BS4) criteria for evaluating the inheritance patterns of variants, but details of how to apply those criteria at appropriate evidence levels were sparse. Several publications have since attempted to provide additional guidance, but anecdotally, this issue is still challenging. Additionally, it is not clear that those prior efforts fully distinguished disease-gene identification considerations from variant pathogenicity considerations nor did they address autosomal-recessive and X-linked inheritance. Here, we have taken a mixed inductive and deductive approach to this problem using real diseases as examples. We have developed a practical heuristic for genetic co-segregation evidence and have also determined that the specific phenotype criterion (PP4) is inseparably coupled to the co-segregation criterion. We have also determined that negative evidence at one locus constitutes positive evidence for other loci for disorders with locus heterogeneity. Finally, we provide a points-based system for evaluating phenotype and co-segregation as evidence types to support or refute a locus and show how that can be integrated into the Bayesian framework now used for variant classification and consistent with the 2015 guidelines.
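As a rough illustration of how an evidence-points total can be folded into the Bayesian framework mentioned above, the sketch below converts points into a posterior probability of pathogenicity and a classification tier. The prior of 0.10, the odds of 350 for "very strong" evidence (8 points), and the tier thresholds are assumptions taken from the commonly cited Bayesian adaptation of the 2015 guidelines, not values stated in this abstract. The function names are hypothetical, and the abstract's own point values for phenotype and co-segregation are not reproduced here.

```python
# Assumed parameters (not given in the abstract above).
PRIOR = 0.10               # assumed prior probability of pathogenicity
ODDS_VERY_STRONG = 350.0   # assumed odds of pathogenicity for 8 points


def posterior_from_points(points: float) -> float:
    """Convert an evidence-points total into a posterior probability.

    Each point scales the odds by 350**(1/8); negative points are benign evidence.
    """
    odds = ODDS_VERY_STRONG ** (points / 8)
    return odds * PRIOR / ((odds - 1) * PRIOR + 1)


def classify(points: int) -> str:
    """Map a points total to a tier, using assumed thresholds."""
    if points >= 10:
        return "Pathogenic"
    if points >= 6:
        return "Likely pathogenic"
    if points >= 0:
        return "Uncertain significance"
    if points >= -6:
        return "Likely benign"
    return "Benign"


if __name__ == "__main__":
    for pts in (-7, 0, 6, 10):
        print(pts, classify(pts), round(posterior_from_points(pts), 3))
```

With zero points the posterior equals the prior, which is a quick sanity check on the formula.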
Subjects
Genetic Testing; Genetic Variation; Humans; Bayes Theorem; Genetic Variation/genetics; Genome, Human; Phenotype
ABSTRACT
Recent studies exploring the impact of methylation in tumor evolution suggest that although the methylation status of many CpG sites is preserved across distinct lineages, that of others is altered as the cancer progresses. Because changes in the methylation status of a CpG site may be retained in mitosis, they could be used to infer the progression history of a tumor via single-cell lineage tree reconstruction. In this work, we introduce the first principled distance-based computational method, Sgootr, for inferring a tumor's single-cell methylation lineage tree and for jointly identifying lineage-informative CpG sites that harbor changes in methylation status that are retained along the lineage. We apply Sgootr to single-cell bisulfite-treated whole-genome sequencing data of multiregionally sampled tumor cells from nine metastatic colorectal cancer patients, as well as multiregionally sampled single-cell reduced-representation bisulfite sequencing data from a glioblastoma patient. We show that the tumor lineages constructed reveal a simple model underlying tumor progression and metastatic seeding. A comparison of Sgootr against alternative approaches shows that Sgootr can construct lineage trees with fewer migration events and in greater concordance with the sequential-progression model of tumor evolution, with a running time that is a fraction of that used in prior studies. Lineage-informative CpG sites identified by Sgootr are in inter-CpG island (CGI) regions, as opposed to intra-CGIs, which have been the main regions of interest in genomic methylation-related analyses.
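To make the idea of a distance-based approach concrete, the sketch below computes a generic pairwise distance between cells from binary methylation calls, skipping sites that are unobserved in either cell. This is not Sgootr's actual distance measure or its site-selection procedure; the normalized Hamming distance, the data layout, and the function names are assumptions chosen for illustration only.

```python
from itertools import combinations


def methylation_distance(a, b):
    """Fraction of CpG sites observed in both cells whose status differs.

    Each cell is a list of 0/1 calls, with None for sites lacking coverage.
    Returns None when the two cells share no observed site.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None
    return sum(x != y for x, y in shared) / len(shared)


def pairwise_distances(cells):
    """Distance for every unordered pair of cells, keyed by sorted cell names."""
    return {
        (i, j): methylation_distance(cells[i], cells[j])
        for i, j in combinations(sorted(cells), 2)
    }


if __name__ == "__main__":
    cells = {
        "c1": [1, 0, None, 1],
        "c2": [1, 1, 0, None],
        "c3": [0, 0, 0, 1],
    }
    for pair, dist in pairwise_distances(cells).items():
        print(pair, dist)
```

A matrix of this kind could then be passed to any standard distance-based tree-building method.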
Subjects
DNA Methylation; Neoplasms; Humans; DNA Methylation/genetics; Sulfites; Sequence Analysis, DNA/methods; Genome; Neoplasms/genetics; CpG Islands/genetics
ABSTRACT
Rapid advances in high-throughput sequencing and a growing realization of the importance of evolutionary theory to cancer genomics have led to a proliferation of phylogenetic studies of tumour progression. These studies have yielded not only new insights but also a plethora of experimental approaches, sometimes reaching conflicting or poorly supported conclusions. Here, we consider this body of work in light of the key computational principles underpinning phylogenetic inference, with the goal of providing practical guidance on the design and analysis of scientifically rigorous tumour phylogeny studies. We survey the range of methods and tools available to the researcher, their key applications, and the various unsolved problems, closing with a perspective on the prospects and broader implications of this field.
Subjects
Biological Evolution; Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Neoplasms/genetics; Phylogeny; Algorithms; Animals; Humans
ABSTRACT
OBJECTIVE: Hepatocellular carcinoma (HCC) represents a typical inflammation-associated cancer. Tissue resident innate lymphoid cells (ILCs) have been suggested to control tumour surveillance. Here, we studied how the local cytokine milieu controls ILCs in HCC. DESIGN: We performed bulk RNA sequencing of HCC tissue as well as flow cytometry and single-cell RNA sequencing of enriched ILCs from non-tumour liver, margin and tumour core derived from 48 patients with HCC. Simultaneous measurement of protein and RNA expression at the single-cell level (AbSeq) identified precise signatures of ILC subgroups. In vitro culturing of ILCs was used to validate findings from in silico analysis. Analysis of RNA-sequencing data from large HCC cohorts allowed stratification and survival analysis based on transcriptomic signatures. RESULTS: RNA sequencing of tumour, non-tumour and margin identified tumour-dependent gradients, which were associated with poor survival and control of ILC plasticity. Single-cell RNA sequencing and flow cytometry of ILCs from HCC livers identified natural killer (NK)-like cells in the non-tumour tissue, losing their cytotoxic profile as they transitioned into tumour ILC1 and NK-like-ILC3 cells. Tumour ILC composition was mediated by cytokine gradients that directed ILC plasticity towards activated tumour ILC2s. This was liver-specific and not seen in ILCs from peripheral blood mononuclear cells. Patients with high ILC2/ILC1 ratio expressed interleukin-33 in the tumour that promoted ILC2 generation, which was associated with better survival. CONCLUSION: Our results suggest that the tumour cytokine milieu controls ILC composition and HCC outcome. Specific changes of cytokines modify ILC composition in the tumour by inducing plasticity and alter ILC function.
Subjects
Carcinoma, Hepatocellular; Liver Neoplasms; Carcinoma, Hepatocellular/metabolism; Cytokines/metabolism; Humans; Immunity, Innate; Killer Cells, Natural/metabolism; Leukocytes, Mononuclear; Liver Neoplasms/metabolism; Lymphocytes; RNA/metabolism; Tumor Microenvironment
ABSTRACT
MOTIVATION: Computational reconstruction of clonal evolution in cancers has become a crucial tool for understanding how tumors initiate and progress and how this process varies across patients. The field still struggles, however, with special challenges of applying phylogenetic methods to cancers, such as the prevalence and importance of copy number alteration (CNA) and structural variation events in tumor evolution, which are difficult to profile accurately by prevailing sequencing methods in such a way that subsequent reconstruction by phylogenetic inference algorithms is accurate. RESULTS: In this work, we develop computational methods to combine sequencing with multiplex interphase fluorescence in situ hybridization to exploit the complementary advantages of each technology in inferring accurate models of clonal CNA evolution accounting for both focal changes and aneuploidy at whole-genome scales. By integrating such information in an integer linear programming framework, we demonstrate on simulated data that incorporation of FISH data substantially improves accurate inference of focal CNA and ploidy changes in clonal evolution from deconvolving bulk sequence data. Analysis of real glioblastoma data for which FISH, bulk sequence and single cell sequence are all available confirms the power of FISH to enhance accurate reconstruction of clonal copy number evolution in conjunction with bulk and optionally single-cell sequence data. AVAILABILITY AND IMPLEMENTATION: Source code is available on Github at https://github.com/CMUSchwartzLab/FISH_deconvolution. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Neoplasms; Software; Humans; In Situ Hybridization, Fluorescence; Phylogeny; Algorithms; Neoplasms/pathology
ABSTRACT
In this work we develop a novel algorithm for reconstructing the genomes of ancestral individuals, given genotype or sequence data from contemporary individuals and an extended pedigree of family relationships. A pedigree with complete genomes for every individual enables the study of allele frequency dynamics and haplotype diversity across generations, including deviations from neutrality such as transmission distortion. When studying heritable diseases, ancestral haplotypes can be used to augment genome-wide association studies and track disease inheritance patterns. The building blocks of our reconstruction algorithm are segments of Identity-By-Descent (IBD) shared between two or more genotyped individuals. The method alternates between identifying a source for each IBD segment and assembling IBD segments placed within each ancestral individual. Unlike previous approaches, our method is able to accommodate complex pedigree structures with hundreds of individuals genotyped at millions of SNPs. We apply our method to an Old Order Amish pedigree from Lancaster, Pennsylvania, whose founders came to North America from Europe during the early 18th century. The pedigree includes 1338 individuals from the past 12 generations, 394 with genotype data. The motivation for reconstruction is to understand the genetic basis of diseases segregating in the family through tracking haplotype transmission over time. Using our algorithm, thread, we are able to reconstruct an average of 224 ancestral individuals per chromosome. For these ancestral individuals, on average we reconstruct 79% of their haplotypes. We also identify a region on chromosome 16 that is difficult to reconstruct; we find that this region harbors a short Amish-specific copy number variation and the gene HYDIN. thread was developed for endogamous populations, but can be applied to any extensive pedigree with the recent generations genotyped. We anticipate that this type of practical ancestral reconstruction will become more common and necessary to understand rare and complex heritable diseases in extended families.
Subjects
DNA Copy Number Variations; Genome-Wide Association Study/methods; Haplotypes; Population Dynamics; Algorithms; Animals; Chromosome Mapping/methods; Computer Simulation; Gene Frequency; Genetic Linkage; Genotype; Humans; Linkage Disequilibrium; Models, Genetic; Pedigree; Polymorphism, Single Nucleotide; Software; Whole Genome Sequencing
ABSTRACT
BACKGROUND: The DNA sequences encoding ribosomal RNA genes (rRNAs) are commonly used as markers to identify species, including in metagenomics samples that may combine many organismal communities. The 16S small subunit ribosomal RNA (SSU rRNA) gene is typically used to identify bacterial and archaeal species. The nuclear 18S SSU rRNA gene, and 28S large subunit (LSU) rRNA gene have been used as DNA barcodes and for phylogenetic studies in different eukaryote taxonomic groups. Because of their popularity, the National Center for Biotechnology Information (NCBI) receives a disproportionate number of rRNA sequence submissions and BLAST queries. These sequences vary in quality, length, origin (nuclear, mitochondria, plastid), and organism source and can represent any region of the ribosomal cistron. RESULTS: To improve the timely verification of quality, origin and loci boundaries, we developed Ribovore, a software package for sequence analysis of rRNA sequences. The ribotyper and ribosensor programs are used to validate incoming sequences of bacterial and archaeal SSU rRNA. The ribodbmaker program is used to create high-quality datasets of rRNAs from different taxonomic groups. Key algorithmic steps include comparing candidate sequences against rRNA sequence profile hidden Markov models (HMMs) and covariance models of rRNA sequence and secondary-structure conservation, as well as other tests. Nine freely available blastn rRNA databases created and maintained with Ribovore are used for checking incoming GenBank submissions and used by the blastn browser interface at NCBI. Since 2018, Ribovore has been used to analyze more than 50 million prokaryotic SSU rRNA sequences submitted to GenBank, and to select at least 10,435 fungal rRNA RefSeq records from type material of 8350 taxa. CONCLUSION: Ribovore combines single-sequence and profile-based methods to improve GenBank processing and analysis of rRNA sequences. 
It is a standalone, portable, and extensible software package for the alignment, classification and validation of rRNA sequences. Researchers planning on submitting SSU rRNA sequences to GenBank are encouraged to download and use Ribovore to analyze their sequences prior to submission to determine which sequences are likely to be automatically accepted into GenBank.
Subjects
Databases, Nucleic Acid; RNA, Ribosomal; DNA, Ribosomal; Phylogeny; RNA, Ribosomal, 16S/genetics; RNA, Ribosomal, 18S/genetics; Sequence Analysis, RNA
ABSTRACT
BACKGROUND: Engineered versions of adeno-associated virus (AAV) are commonly used in gene therapy but evidence revealing a potential oncogenic role of natural AAV in hepatocellular carcinoma (HCC) has raised concerns. The frequency of potentially oncogenic integrations has been reported in only a few populations. AAV infection and host genome integration in another type of liver cancer, cholangiocarcinoma (CCA), has been studied only in one cohort. All reported oncogenic AAV integrations in HCC come from strains resembling the fully sequenced AAV2 and partly sequenced AAV13. When AAV integration occurs, only a fragment of the AAV genome is detectable in later DNA or RNA sequencing. The integrated fragment is typically from the 3' end of the AAV genome, and this positional bias has been only partly explained. Three research groups searched for evidence of AAV integration in HCC RNAseq samples in the Cancer Genome Atlas (TCGA) but reported conflicting results. RESULTS: We collected and analyzed whole transcriptome and viral capture DNA sequencing in paired tumor and non-tumor samples from two liver cancer Asian cohorts from Thailand (N = 147, 47 HCC and 100 intrahepatic cholangiocarcinoma (iCCA)) and Mongolia (N = 70, all HCC). We found only one HCC patient with a potentially oncogenic integration of AAV, in contrast to higher frequency reported in European patients. There were no oncogenic AAV integrations in iCCA patients. AAV genomic segments are present preferentially in the non-tumor samples of Thai patients. By analyzing the AAV genome positions of oncogenic and non-oncogenic integrated fragments, we found that almost all the putative oncogenic integrations overlap the X gene, which is present and functional only in the strain AAV2 among all fully sequenced strains. This gene content difference could explain why putative oncogenic integrations from other AAV strains have not been reported. 
We resolved the discrepancies in previous analyses of AAV presence in TCGA HCC samples and extended the analysis to CCA. There are 12 TCGA samples with an AAV segment, and none are from Asian patients. AAV segments are present preferentially in TCGA non-tumor samples, as we observed in the Thai patients. CONCLUSIONS: Our findings suggest a minimal AAV risk of hepatocarcinogenesis in Asian liver cancer patients. The partial genome presence and positional bias of AAV integrations into the human genome have complicated analysis of possible roles of AAV in liver cancer.
Subjects
Bile Duct Neoplasms; Carcinoma, Hepatocellular; Liver Neoplasms; Bile Duct Neoplasms/genetics; Bile Ducts, Intrahepatic; Carcinogenesis; Carcinoma, Hepatocellular/genetics; Dependovirus/genetics; Hepatitis B virus; Humans; Liver Neoplasms/genetics; Thailand; Virus Integration/genetics
ABSTRACT
MOTIVATION: Recent advances in single-cell sequencing (SCS) offer an unprecedented insight into tumor emergence and evolution. Principled approaches to tumor phylogeny reconstruction via SCS data are typically based on general computational methods for solving an integer linear program, or a constraint satisfaction program, which, although guaranteeing convergence to the most likely solution, are very slow. Others based on Monte Carlo Markov Chain or alternative heuristics not only offer no such guarantee, but also are not faster in practice. As a result, novel methods that can scale up to handle the size and noise characteristics of emerging SCS data are highly desirable to fully utilize this technology. RESULTS: We introduce PhISCS-BnB (phylogeny inference using SCS via branch and bound), a branch and bound algorithm to compute the most likely perfect phylogeny on an input genotype matrix extracted from an SCS dataset. PhISCS-BnB not only offers an optimality guarantee, but is also 10-100 times faster than the best available methods on simulated tumor SCS data. We also applied PhISCS-BnB on a recently published large melanoma dataset derived from the sublineages of a cell line involving 20 clones with 2367 mutations, which returned the optimal tumor phylogeny in <4 h. The resulting phylogeny agrees with and extends the published results by providing a more detailed picture on the clonal evolution of the tumor. AVAILABILITY AND IMPLEMENTATION: https://github.com/algo-cancer/PhISCS-BnB. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
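For readers unfamiliar with the perfect-phylogeny condition referred to above, the sketch below checks whether a binary genotype matrix (rows are cells, columns are mutations, 0 is the ancestral state) is conflict-free, using the classical three-gametes test. This is not the PhISCS-BnB branch and bound algorithm itself; it only tests the property that the algorithm's output matrix is expected to satisfy, and the function name is hypothetical.

```python
from itertools import combinations


def is_conflict_free(matrix):
    """Return True if the binary matrix admits a perfect phylogeny.

    Three-gametes test: no pair of columns may show all three of the
    row patterns (1, 1), (1, 0) and (0, 1).
    """
    if not matrix:
        return True
    n_cols = len(matrix[0])
    forbidden = {(1, 1), (1, 0), (0, 1)}
    for p, q in combinations(range(n_cols), 2):
        seen = {(row[p], row[q]) for row in matrix}
        if forbidden <= seen:
            return False
    return True


if __name__ == "__main__":
    noisy = [[1, 1], [1, 0], [0, 1]]             # columns 0 and 1 conflict
    clean = [[1, 1, 0], [1, 0, 0], [0, 0, 1]]    # nested or disjoint columns
    print(is_conflict_free(noisy), is_conflict_free(clean))
```

A method of the kind described in the abstract would search for the most likely set of entry corrections that turns a noisy matrix into one passing this test.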
Subjects
Algorithms; Neoplasms; Humans; Markov Chains; Neoplasms/genetics; Phylogeny; Sequence Analysis; Software
ABSTRACT
Prognosis in young patients with breast cancer is generally poor, yet considerable differences in clinical outcomes between individual patients exist. To understand the genetic basis of the disparate clinical courses, tumors were collected from 34 younger women, 17 with good and 17 with poor outcomes, as determined by disease-specific survival during a follow-up period of 17 years. The clinicopathologic parameters of the tumors were complemented with DNA image cytometry profiles, enumeration of copy numbers of eight breast cancer genes by multicolor fluorescence in situ hybridization, and targeted sequence analysis of 563 cancer genes. Both groups included diploid and aneuploid tumors. The degree of intratumor heterogeneity was significantly higher in aneuploid versus diploid cases, and so were gains of the oncogenes MYC and ZNF217. Significantly more copy number alterations were observed in the group with poor outcome. Almost all tumors in the group with long survival were classified as luminal A, whereas triple-negative tumors predominantly occurred in the short survival group. Mutations in PIK3CA were more common in the group with good outcome, whereas TP53 mutations were more frequent in patients with poor outcomes. This study shows that TP53 mutations and the extent of genomic imbalances are associated with poor outcome in younger breast cancer patients and thus emphasizes the central role of genomic instability vis-a-vis tumor aggressiveness.
Subjects
Breast Neoplasms/genetics; DNA Copy Number Variations; Genomic Instability; Mutation; Tumor Suppressor Protein p53/genetics; Adult; Biomarkers, Tumor/genetics; Breast Neoplasms/mortality; Breast Neoplasms/pathology; Disease-Free Survival; Female; Gene Expression Regulation, Neoplastic; High-Throughput Nucleotide Sequencing; Humans; Middle Aged; Prognosis; Survival Rate
ABSTRACT
The COVID-19 pandemic caused by SARS-CoV-2 is a global health challenge. Angiotensin-converting enzyme 2 (ACE2) is the host receptor for SARS-CoV-2 entry. Recent studies have suggested that patients with hypertension and diabetes treated with ACE inhibitors (ACEIs) or angiotensin receptor blockers have a higher risk of COVID-19 infection as these drugs could upregulate ACE2, motivating the study of ACE2 modulation by drugs in current clinical use. Here, we mined published datasets to determine the effects of hundreds of clinically approved drugs on ACE2 expression. We find that ACEIs are enriched for ACE2-upregulating drugs, while antineoplastic agents are enriched for ACE2-downregulating drugs. Vorinostat and isotretinoin are the top ACE2 up/downregulators, respectively, in cell lines. Dexamethasone, a corticosteroid used in treating severe acute respiratory syndrome and COVID-19, significantly upregulates ACE2 both in vitro and in vivo. Further top ACE2 regulators in vivo or in primary cells include erlotinib and bleomycin in the lung and vancomycin, cisplatin, and probenecid in the kidney. Our study provides leads for future work studying ACE2 expression modulators.
Subjects
Angiotensin Receptor Antagonists/pharmacology; Angiotensin-Converting Enzyme Inhibitors/pharmacology; Coronavirus Infections/drug therapy; Pneumonia, Viral/drug therapy; A549 Cells; Angiotensin-Converting Enzyme 2; Betacoronavirus; Bleomycin/pharmacology; COVID-19; Dexamethasone/pharmacology; Drug Design; Drug Evaluation, Preclinical; Erlotinib Hydrochloride/pharmacology; Fluphenazine/pharmacology; HEK293 Cells; Humans; Kidney/drug effects; Lung/drug effects; MCF-7 Cells; Pandemics; Peptidyl-Dipeptidase A; SARS-CoV-2; Systems Biology; Up-Regulation; Vemurafenib/pharmacology; COVID-19 Drug Treatment
ABSTRACT
Modifier genes are believed to account for the clinical variability observed in many Mendelian disorders, but their identification remains challenging due to the limited availability of genomics data from large patient cohorts. Here, we present GENDULF (GENetic moDULators identiFication), one of the first methods to facilitate prediction of disease modifiers using healthy and diseased tissue gene expression data. GENDULF is designed for monogenic diseases in which the mechanism is loss of function leading to reduced expression of the mutated gene. When applied to cystic fibrosis, GENDULF successfully identifies multiple, previously established disease modifiers, including EHF, SLC6A14, and CLCA1. It is then utilized in spinal muscular atrophy (SMA) and predicts U2AF1 as a modifier whose low expression correlates with higher SMN2 pre-mRNA exon 7 retention. Indeed, knockdown of U2AF1 in SMA patient-derived cells leads to increased full-length SMN2 transcript and SMN protein expression. Taking advantage of the increasing availability of transcriptomic data, GENDULF is a novel addition to existing strategies for prediction of genetic disease modifiers, providing insights into disease pathogenesis and uncovering novel therapeutic targets.
Subjects
Algorithms; Data Mining; Disease/genetics; Genes, Modifier; Transcriptome/genetics; Genetic Association Studies; Genetic Linkage; HEK293 Cells; Humans; Reproducibility of Results
ABSTRACT
BACKGROUND: GenBank contains over 3 million viral sequences. The National Center for Biotechnology Information (NCBI) previously made available a tool for validating and annotating influenza virus sequences that is used to check submissions to GenBank. Before this project, there was no analogous tool in use for non-influenza viral sequence submissions. RESULTS: We developed a system called VADR (Viral Annotation DefineR) that validates and annotates viral sequences in GenBank submissions. The annotation system is based on the analysis of the input nucleotide sequence using models built from curated RefSeqs. Hidden Markov models are used to classify sequences by determining the RefSeq they are most similar to, and feature annotation from the RefSeq is mapped based on a nucleotide alignment of the full sequence to a covariance model. Predicted proteins encoded by the sequence are validated with nucleotide-to-protein alignments using BLAST. The system identifies 43 types of "alerts" that (unlike the previous BLAST-based system) provide deterministic and rigorous feedback to researchers who submit sequences with unexpected characteristics. VADR has been integrated into GenBank's submission processing pipeline allowing for viral submissions passing all tests to be accepted and annotated automatically, without the need for any human (GenBank indexer) intervention. Unlike the previous submission-checking system, VADR is freely available (https://github.com/nawrockie/vadr) for local installation and use. VADR has been used for Norovirus submissions since May 2018 and for Dengue virus submissions since January 2019. Since March 2020, VADR has also been used to check SARS-CoV-2 sequence submissions. Other viruses with high numbers of submissions will be added incrementally. CONCLUSION: VADR improves the speed with which non-flu virus submissions to GenBank can be checked and improves the content and quality of the GenBank annotations. 
The availability and portability of the software allow researchers to run the GenBank checks prior to submitting their viral sequences, and thereby gain confidence that their submissions will be accepted immediately without the need to correspond with GenBank staff. Reciprocally, the adoption of VADR frees GenBank staff to spend more time on services other than checking routine viral sequence submissions.
Subjects
Betacoronavirus; Coronavirus Infections; Databases, Nucleic Acid; Molecular Sequence Annotation; Pandemics; Pneumonia, Viral; Software; Betacoronavirus/genetics; COVID-19; Coronavirus Infections/genetics; DNA Viruses; Genomics; Humans; Molecular Sequence Annotation/standards; Pneumonia, Viral/genetics; SARS-CoV-2; Viruses
ABSTRACT
BACKGROUND: A single variant in NAA10 (c.471+2T>A), the gene encoding N-acetyltransferase 10, has been associated with Lenz microphthalmia syndrome. In this study, we aimed to identify causative variants in families with syndromic X-linked microphthalmia. METHODS: Three families, including 15 affected individuals with syndromic X-linked microphthalmia, underwent analyses including linkage analysis, exome sequencing and targeted gene sequencing. The consequences of two identified variants in NAA10 were evaluated using quantitative PCR and RNAseq. RESULTS: Genetic linkage analysis in family 1 supported a candidate region on Xq27-q28, which included NAA10. Exome sequencing identified a hemizygous NAA10 polyadenylation signal (PAS) variant, chrX:153,195,397T>C, c.*43A>G, which segregated with the disease. Targeted sequencing of affected males from families 2 and 3 identified distinct NAA10 PAS variants, chrX:g.153,195,401T>C, c.*39A>G and chrX:g.153,195,400T>C, c.*40A>G. All three variants were absent from gnomAD. Quantitative PCR and RNAseq showed reduced NAA10 mRNA levels and abnormal 3' UTRs in affected individuals. Targeted sequencing of NAA10 in 376 additional affected individuals failed to identify variants in the PAS. CONCLUSION: These data show that PAS variants are the most common variant type in NAA10-associated syndromic microphthalmia, suggesting reduced RNA is the molecular mechanism by which these alterations cause microphthalmia/anophthalmia. We reviewed recognised variants in PAS associated with Mendelian disorders and identified only 23 others, indicating that NAA10 harbours more than 10% of all known PAS variants. We hypothesise that PAS in other genes harbour unrecognised pathogenic variants associated with Mendelian disorders. The systematic interrogation of PAS could improve genetic testing yields.
Subjects
3' Untranslated Regions; Genetic Association Studies; Genetic Predisposition to Disease; Genetic Variation; N-Terminal Acetyltransferase A/genetics; N-Terminal Acetyltransferase E/genetics; Poly A; Alleles; Anophthalmos; Female; Genes, X-Linked; Genotype; Humans; Lod Score; Male; Microphthalmos; Pedigree; Sequence Analysis, DNA; X Chromosome Inactivation
ABSTRACT
BACKGROUND: Caspase activation and recruitment domain 11 (CARD11) encodes a scaffold protein in lymphocytes that links antigen receptor engagement with downstream signaling to nuclear factor κB, c-Jun N-terminal kinase, and mechanistic target of rapamycin complex 1. Germline CARD11 mutations cause several distinct primary immune disorders in human subjects, including severe combined immune deficiency (biallelic null mutations), B-cell expansion with nuclear factor κB and T-cell anergy (heterozygous, gain-of-function mutations), and severe atopic disease (loss-of-function, heterozygous, dominant interfering mutations), which has focused attention on CARD11 mutations discovered by using whole-exome sequencing. OBJECTIVES: We sought to determine the molecular actions of an extended allelic series of CARD11 and to characterize the expanding range of clinical phenotypes associated with heterozygous CARD11 loss-of-function alleles. METHODS: Cell transfections and primary T-cell assays were used to evaluate signaling and function of CARD11 variants. RESULTS: Here we report on an expanded cohort of patients harboring novel heterozygous CARD11 mutations that extend beyond atopy to include other immunologic phenotypes not previously associated with CARD11 mutations. In addition to (and sometimes excluding) severe atopy, heterozygous missense and indel mutations in CARD11 presented with immunologic phenotypes similar to those observed in signal transducer and activator of transcription 3 loss of function, dedicator of cytokinesis 8 deficiency, common variable immunodeficiency, neutropenia, and immune dysregulation, polyendocrinopathy, enteropathy, X-linked-like syndrome. Pathogenic variants exhibited dominant negative activity and were largely confined to the CARD or coiled-coil domains of the CARD11 protein. 
CONCLUSION: These results illuminate a broader phenotypic spectrum associated with CARD11 mutations in human subjects and underscore the need for functional studies: rare gene variants encountered in expected and unexpected phenotypes must nonetheless be validated for pathogenic activity.
Subjects
CARD Signaling Adaptor Proteins/genetics; CARD Signaling Adaptor Proteins/immunology; Guanylate Cyclase/genetics; Guanylate Cyclase/immunology; Immune System Diseases/genetics; Immune System Diseases/immunology; Adult; Female; Humans; Male; Mutation; Phenotype
ABSTRACT
Motivation: Nucleic acid sequences in public databases should not contain vector contamination, but many sequences in GenBank do (or did) contain vectors. The National Center for Biotechnology Information uses the program VecScreen to screen submitted sequences for contamination. Additional tools are needed to distinguish true-positive (contamination) from false-positive (not contamination) VecScreen matches. Results: A principal reason for false-positive VecScreen matches is that the sequence and the matching vector subsequence originate from closely related or identical organisms (for example, both originate in Escherichia coli). We collected information on the taxonomy of sources of vector segments in the UniVec database used by VecScreen. We used that information in two overlapping software pipelines for retrospective analysis of contamination in GenBank and for prospective analysis of contamination in new sequence submissions. Using the retrospective pipeline, we identified and corrected over 8000 contaminated sequences in the nonredundant nucleotide database. The prospective analysis pipeline has been in production use since April 2017 to evaluate some new GenBank submissions. Availability and implementation: Data on the sources of UniVec entries were included in release 10.0 (ftp://ftp.ncbi.nih.gov/pub/UniVec/). The main software is freely available at https://github.com/aaschaffer/vecscreen_plus_taxonomy. Contact: aschaffe@helix.nih.gov. Supplementary information: Supplementary data are available at Bioinformatics online.
Subjects
Databases, Nucleic Acid/standards; Sequence Analysis, DNA/methods; Software; Bacteria; Eukaryota
ABSTRACT
The Virus Variation Resource is a value-added viral sequence data resource hosted by the National Center for Biotechnology Information. The resource is located at http://www.ncbi.nlm.nih.gov/genome/viruses/variation/ and includes modules for seven viral groups: influenza virus, Dengue virus, West Nile virus, Ebolavirus, MERS coronavirus, Rotavirus A and Zika virus. Each module is supported by pipelines that scan newly released GenBank records, annotate genes and proteins, parse sample descriptors and map them to a controlled vocabulary. These processes in turn support a purpose-built search interface where users can select sequences based on standardized gene, protein and metadata terms. Once sequences are selected, a suite of tools for downloading data, multi-sequence alignment and tree building supports a variety of user-directed activities. This manuscript describes a series of features and functionalities recently added to the Virus Variation Resource.
Subjects
Computational Biology/methods; Disease Outbreaks; Genetic Variation; Software; Virus Diseases/epidemiology; Virus Diseases/virology; Viruses/genetics; Databases, Genetic
ABSTRACT
BACKGROUND: Primary antibody deficiencies (PADs) are the most frequent primary immunodeficiencies in human subjects. The genetic causes of PADs are largely unknown. Sec61 translocon alpha 1 subunit (SEC61A1) is the major subunit of the Sec61 complex, which is the main polypeptide-conducting channel in the endoplasmic reticulum membrane. SEC61A1 is a target gene of spliced X-box binding protein 1 and strongly induced during plasma cell (PC) differentiation. OBJECTIVE: We identified a novel genetic defect and studied its pathologic mechanism in 11 patients from 2 unrelated families with PADs. METHODS: Whole-exome and targeted sequencing were conducted to identify novel genetic mutations. Functional studies were carried out ex vivo in primary cells of patients and in vitro in different cell lines to assess the effect of SEC61A1 mutations on B-cell differentiation and survival. RESULTS: We investigated 2 families with patients with hypogammaglobulinemia, severe recurrent respiratory tract infections, and normal peripheral B- and T-cell subpopulations. On in vitro stimulation, B cells showed an intrinsic deficiency to develop into PCs. Genetic analysis and targeted sequencing identified novel heterozygous missense (c.254T>A, p.V85D) and nonsense (c.1325G>T, p.E381*) mutations in SEC61A1, segregating with the disease phenotype. SEC61A1-V85D was deficient in cotranslational protein translocation, and it disturbed the cellular calcium homeostasis in HeLa cells. Moreover, SEC61A1-V85D triggered the terminal unfolded protein response in multiple myeloma cell lines. CONCLUSION: We describe a monogenic defect leading to a specific PC deficiency in human subjects, expanding our knowledge about the pathogenesis of antibody deficiencies.
Subjects
Immunologic Deficiency Syndromes/genetics; Mutation/genetics; Plasma Cells/pathology; SEC Translocation Channels/genetics; Agammaglobulinemia/genetics; Agammaglobulinemia/metabolism; Agammaglobulinemia/pathology; B-Lymphocytes/metabolism; B-Lymphocytes/pathology; Calcium/metabolism; Cell Differentiation/genetics; Cell Line; Cell Line, Tumor; Exome/genetics; HEK293 Cells; HeLa Cells; Heterozygote; Humans; Immunologic Deficiency Syndromes/metabolism; Plasma Cells/metabolism; Protein Transport/genetics; Respiratory Tract Infections/genetics; Respiratory Tract Infections/metabolism; Respiratory Tract Infections/pathology; T-Lymphocytes/metabolism; T-Lymphocytes/pathology; Unfolded Protein Response/genetics
ABSTRACT
The clinical course of breast cancer varies from one patient to another. Currently, the choice of therapy relies on clinical parameters and on histological and molecular tumor features. However, these markers are informative in only a subset of patients. Therefore, additional predictors of disease outcome would be valuable for treatment stratification. Extensive studies showed that the degree of variation of the nuclear DNA content, i.e., aneuploidy, determines prognosis. Our aim was to further elucidate the molecular basis of aneuploidy. We analyzed five diploid and six aneuploid tumors with more than 20 years of follow-up. By performing FISH with a multiplexed panel of 10 probes to enumerate copy numbers in individual cells, and by sequencing 563 cancer-related genes, we analyzed how aneuploidy is linked to intratumor heterogeneity. In our cohort, none of the patients with diploid tumors died of breast cancer during follow-up, in contrast to four of six patients with aneuploid tumors (mean survival 86.4 months). The FISH analysis showed markedly increased genomic instability and intratumor heterogeneity in aneuploid tumors. MYC gain was observed in only 20% of the diploid cancers, while all aneuploid cases showed a gain. The mutation burden was similar in diploid and aneuploid tumors; however, TP53 mutations were observed in all aneuploid tumors and in none of the diploid tumors in our cohort. We conclude that quantitative measurements of intratumor heterogeneity by multiplex FISH, together with detection of MYC amplification and TP53 mutation, could augment prognostication in breast cancer patients.
Subjects
Aneuploidy; Breast Neoplasms/genetics; Mutation; Proto-Oncogene Proteins c-myc/genetics; Tumor Suppressor Protein p53/genetics; Adult; Aged; Breast Neoplasms/metabolism; Breast Neoplasms/pathology; DNA, Neoplasm/genetics; Female; Flow Cytometry; Gene Amplification; Humans; In Situ Hybridization, Fluorescence; Middle Aged; Prognosis; Proto-Oncogene Proteins c-myc/metabolism; Tumor Suppressor Protein p53/metabolism
ABSTRACT
Intratumor heterogeneity is a major challenge in cancer treatment. To decipher patterns of chromosomal heterogeneity, we analyzed six colorectal cancer cell lines by multiplex interphase FISH (miFISH). The mismatch-repair-deficient cell lines DLD-1 and HCT116 had the most stable copy numbers, whereas aneuploid cell lines (HT-29, SW480, SW620 and H508) displayed a higher degree of instability. We subsequently assessed the clonal evolution of single cells in two colorectal carcinoma cell lines, SW480 and HT-29, which both have aneuploid karyotypes but different degrees of chromosomal instability. The clonal compositions of the single cell-derived daughter lines, as assessed by miFISH, differed for HT-29 and SW480. Daughters of HT-29 were stable and clonal, with little heterogeneity. Daughters of SW480 were more heterogeneous, with the single cell-derived daughter lines separating into two distinct populations with different ploidy (hyper-diploid and near-triploid), morphology, gene expression and tumorigenicity. To better understand the evolutionary trajectory of the two SW480 populations, we constructed phylogenetic trees, which showed ongoing instability in the daughter lines. When analyzing the evolutionary development over time, most single cell-derived daughter lines maintained their major clonal pattern, with the exception of one daughter line that showed a switch involving a loss of APC. Our analysis of the clonal evolution and composition of these colorectal cancer models shows that all chromosomes are subject to segregation errors; however, specific net genomic imbalances are maintained. Karyotype evolution is driven by the necessity to arrive at and maintain a specific plateau of chromosomal copy numbers as the drivers of carcinogenesis.