Search | Nursing VHL Search Portal

1.

Megabase Length Hypermutation Accompanies Human Structural Variation at 17p11.2.

Beck, Christine R; Carvalho, Claudia M B; Akdemir, Zeynep C; Sedlazeck, Fritz J; Song, Xiaofei; Meng, Qingchang; Hu, Jianhong; Doddapaneni, Harsha; Chong, Zechen; Chen, Edward S; Thornton, Philip C; Liu, Pengfei; Yuan, Bo; Withers, Marjorie; Jhangiani, Shalini N; Kalra, Divya; Walker, Kimberly; English, Adam C; Han, Yi; Chen, Ken; Muzny, Donna M; Ira, Grzegorz; Shaw, Chad A; Gibbs, Richard A; Hastings, P J; Lupski, James R.

Cell ; 176(6): 1310-1324.e10, 2019 03 07.

Article in English | MEDLINE | ID: mdl-30827684

ABSTRACT

DNA rearrangements resulting in human genome structural variants (SVs) are caused by diverse mutational mechanisms. We used long- and short-read sequencing technologies to investigate end products of de novo chromosome 17p11.2 rearrangements and query the molecular mechanisms underlying both recurrent and non-recurrent events. Evidence for an increased rate of clustered single-nucleotide variant (SNV) mutation in cis with non-recurrent rearrangements was found. Indel and SNV formation are associated with both copy-number gains and losses of 17p11.2, occur up to ~1 Mb away from the breakpoint junctions, and favor C > G transversion substitutions; results suggest that single-stranded DNA is formed during the genesis of the SV and provide compelling support for a microhomology-mediated break-induced replication (MMBIR) mechanism for SV formation. Our data show an additional mutational burden of MMBIR consisting of hypermutation confined to the locus and manifesting as SNVs and indels predominantly within genes.

Subject(s)

Chromosomes, Human, Pair 17 , Mutation , Abnormalities, Multiple/genetics , Chromosome Breakpoints , Chromosome Disorders/genetics , Chromosome Duplication/genetics , DNA Copy Number Variations , DNA Repair/genetics , DNA Replication , Gene Rearrangement , Genome, Human , Genomic Structural Variation , Humans , INDEL Mutation , Models, Genetic , Polymorphism, Single Nucleotide , Recombination, Genetic , Sequence Analysis, DNA/methods , Smith-Magenis Syndrome/genetics

2.

Comprehensive Characterization of Cancer Driver Genes and Mutations.

Bailey, Matthew H; Tokheim, Collin; Porta-Pardo, Eduard; Sengupta, Sohini; Bertrand, Denis; Weerasinghe, Amila; Colaprico, Antonio; Wendl, Michael C; Kim, Jaegil; Reardon, Brendan; Ng, Patrick Kwok-Shing; Jeong, Kang Jin; Cao, Song; Wang, Zixing; Gao, Jianjiong; Gao, Qingsong; Wang, Fang; Liu, Eric Minwei; Mularoni, Loris; Rubio-Perez, Carlota; Nagarajan, Niranjan; Cortés-Ciriano, Isidro; Zhou, Daniel Cui; Liang, Wen-Wei; Hess, Julian M; Yellapantula, Venkata D; Tamborero, David; Gonzalez-Perez, Abel; Suphavilai, Chayaporn; Ko, Jia Yu; Khurana, Ekta; Park, Peter J; Van Allen, Eliezer M; Liang, Han; Lawrence, Michael S; Godzik, Adam; Lopez-Bigas, Nuria; Stuart, Josh; Wheeler, David; Getz, Gad; Chen, Ken; Lazar, Alexander J; Mills, Gordon B; Karchin, Rachel; Ding, Li.

Cell ; 173(2): 371-385.e18, 2018 04 05.

Article in English | MEDLINE | ID: mdl-29625053

ABSTRACT

Identifying molecular cancer drivers is critical for precision oncology. Multiple advanced algorithms to identify drivers now exist, but systematic attempts to combine and optimize them on large datasets are few. We report a PanCancer and PanSoftware analysis spanning 9,423 tumor exomes (comprising all 33 of The Cancer Genome Atlas projects) and using 26 computational tools to catalog driver genes and mutations. We identify 299 driver genes with implications regarding their anatomical sites and cancer/cell types. Sequence- and structure-based analyses identified >3,400 putative missense driver mutations supported by multiple lines of evidence. Experimental validation confirmed 60%-85% of predicted mutations as likely drivers. We found that >300 MSI tumors are associated with high PD-1/PD-L1, and 57% of tumors analyzed harbor putative clinically actionable events. Our study represents the most comprehensive discovery of cancer genes and mutations to date and will serve as a blueprint for future biological and clinical endeavors.

Subject(s)

Neoplasms/pathology , Algorithms , B7-H1 Antigen/genetics , Computational Biology , Databases, Genetic , Entropy , Humans , Microsatellite Instability , Mutation , Neoplasms/genetics , Neoplasms/immunology , Principal Component Analysis , Programmed Cell Death 1 Receptor/genetics

3.

Pathogenic Germline Variants in 10,389 Adult Cancers.

Huang, Kuan-Lin; Mashl, R Jay; Wu, Yige; Ritter, Deborah I; Wang, Jiayin; Oh, Clara; Paczkowska, Marta; Reynolds, Sheila; Wyczalkowski, Matthew A; Oak, Ninad; Scott, Adam D; Krassowski, Michal; Cherniack, Andrew D; Houlahan, Kathleen E; Jayasinghe, Reyka; Wang, Liang-Bo; Zhou, Daniel Cui; Liu, Di; Cao, Song; Kim, Young Won; Koire, Amanda; McMichael, Joshua F; Hucthagowder, Vishwanathan; Kim, Tae-Beom; Hahn, Abigail; Wang, Chen; McLellan, Michael D; Al-Mulla, Fahd; Johnson, Kimberly J; Lichtarge, Olivier; Boutros, Paul C; Raphael, Benjamin; Lazar, Alexander J; Zhang, Wei; Wendl, Michael C; Govindan, Ramaswamy; Jain, Sanjay; Wheeler, David; Kulkarni, Shashikant; Dipersio, John F; Reimand, Jüri; Meric-Bernstam, Funda; Chen, Ken; Shmulevich, Ilya; Plon, Sharon E; Chen, Feng; Ding, Li.

Cell ; 173(2): 355-370.e14, 2018 04 05.

Article in English | MEDLINE | ID: mdl-29625052

ABSTRACT

We conducted the largest investigation of predisposition variants in cancer to date, discovering 853 pathogenic or likely pathogenic variants in 8% of 10,389 cases from 33 cancer types. Twenty-one genes showed single or cross-cancer associations, including novel associations of SDHA in melanoma and PALB2 in stomach adenocarcinoma. The 659 predisposition variants and 18 additional large deletions in tumor suppressors, including ATM, BRCA1, and NF1, showed low gene expression and frequent (43%) loss of heterozygosity or biallelic two-hit events. We also discovered 33 such variants in oncogenes, including missenses in MET, RET, and PTPN11 associated with high gene expression. We nominated 47 additional predisposition variants from prioritized VUSs supported by multiple evidences involving case-control frequency, loss of heterozygosity, expression effect, and co-localization with mutations and modified residues. Our integrative approach links rare predisposition variants to functional consequences, informing future guidelines of variant classification and germline genetic testing in cancer.

Subject(s)

Germ Cells/metabolism , Neoplasms/pathology , DNA Copy Number Variations , Databases, Genetic , Gene Deletion , Gene Frequency , Genetic Predisposition to Disease , Genotype , Germ Cells/cytology , Germ-Line Mutation , Humans , Loss of Heterozygosity/genetics , Mutation, Missense , Neoplasms/genetics , Polymorphism, Single Nucleotide , Proto-Oncogene Proteins c-met/genetics , Proto-Oncogene Proteins c-ret/genetics , Tumor Suppressor Proteins/genetics

4.

An Organismal CNV Mutator Phenotype Restricted to Early Human Development.

Liu, Pengfei; Yuan, Bo; Carvalho, Claudia M B; Wuster, Arthur; Walter, Klaudia; Zhang, Ling; Gambin, Tomasz; Chong, Zechen; Campbell, Ian M; Coban Akdemir, Zeynep; Gelowani, Violet; Writzl, Karin; Bacino, Carlos A; Lindsay, Sarah J; Withers, Marjorie; Gonzaga-Jauregui, Claudia; Wiszniewska, Joanna; Scull, Jennifer; Stankiewicz, Pawel; Jhangiani, Shalini N; Muzny, Donna M; Zhang, Feng; Chen, Ken; Gibbs, Richard A; Rautenstrauss, Bernd; Cheung, Sau Wai; Smith, Janice; Breman, Amy; Shaw, Chad A; Patel, Ankita; Hurles, Matthew E; Lupski, James R.

Cell ; 168(5): 830-842.e7, 2017 02 23.

Article in English | MEDLINE | ID: mdl-28235197

ABSTRACT

De novo copy number variants (dnCNVs) arising at multiple loci in a personal genome have usually been considered to reflect cancer somatic genomic instabilities. We describe a multiple dnCNV (MdnCNV) phenomenon in which individuals with genomic disorders carry five to ten constitutional dnCNVs. These CNVs originate from independent formation incidences, are predominantly tandem duplications or complex gains, exhibit breakpoint junction features reminiscent of replicative repair, and show increased de novo point mutations flanking the rearrangement junctions. The active CNV mutation shower appears to be restricted to a transient perizygotic period. We propose that a defect in the CNV formation process is responsible for the "CNV-mutator state," and this state is dampened after early embryogenesis. The constitutional MdnCNV phenomenon resembles chromosomal instability in various cancers. Investigations of this phenomenon may provide unique access to understanding genomic disorders, structural variant mutagenesis, human evolution, and cancer biology.

Subject(s)

Chromosome Aberrations , DNA Copy Number Variations , Genetic Diseases, Inborn/embryology , Genetic Diseases, Inborn/genetics , Genomic Instability , Mutation , Chromosome Breakpoints , Chromosome Duplication , DNA Replication , Embryonic Development , Female , Gametogenesis , Humans , Male

5.

KLF5 governs sphingolipid metabolism and barrier function of the skin.

Lyu, Ying; Guan, Yinglu; Deliu, Lisa; Humphrey, Ericka; Frontera, Joanna K; Yang, Youn Joo; Zamler, Daniel; Kim, Kun Hee; Mohanty, Vakul; Jin, Kevin; Mohanty, Vakul; Liu, Virginia; Dou, Jinzhuang; Veillon, Lucas J; Kumar, Shwetha V; Lorenzi, Philip L; Chen, Yang; McAndrews, Kathleen M; Grivennikov, Sergei; Song, Xingzhi; Zhang, Jianhua; Xi, Yuanxin; Wang, Jing; Chen, Ken; Nagarajan, Priyadharsini; Ge, Yejing.

Genes Dev ; 2022 Aug 25.

Article in English | MEDLINE | ID: mdl-36008138

ABSTRACT

Stem cells are fundamental units of tissue remodeling whose functions are dictated by lineage-specific transcription factors. Home to epidermal stem cells and their upward-stratifying progenies, skin relies on its secretory functions to form the outermost protective barrier, of which a transcriptional orchestrator has been elusive. KLF5 is a Krüppel-like transcription factor broadly involved in development and regeneration whose lineage specificity, if any, remains unclear. Here we report KLF5 specifically marks the epidermis, and its deletion leads to skin barrier dysfunction in vivo. Lipid envelopes and secretory lamellar bodies are defective in KLF5-deficient skin, accompanied by preferential loss of complex sphingolipids. KLF5 binds to and transcriptionally regulates genes encoding rate-limiting sphingolipid metabolism enzymes. Remarkably, skin barrier defects elicited by KLF5 ablation can be rescued by dietary interventions. Finally, we found that KLF5 is widely suppressed in human diseases with disrupted epidermal secretion, and its regulation of sphingolipid metabolism is conserved in human skin. Altogether, we established KLF5 as a disease-relevant transcription factor governing sphingolipid metabolism and barrier function in the skin, likely representing a long-sought secretory lineage-defining factor across tissue types.

6.

Functional-group translocation of cyano groups by reversible C-H sampling.

Chen, Ken; Zeng, Qingrui; Xie, Longhuan; Xue, Zisheng; Wang, Jianbo; Xu, Yan.

Nature ; 620(7976): 1007-1012, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37364765

ABSTRACT

Chemical transformations that introduce, remove or manipulate functional groups are ubiquitous in synthetic chemistry1. Unlike conventional functional-group interconversion reactions that swap one functionality for another, transformations that alter solely the location of functional groups are far less explored. Here, by photocatalytic, reversible C-H sampling, we report a functional-group translocation reaction of cyano (CN) groups in common nitriles, allowing for the direct positional exchange between a CN group and an unactivated C-H bond. The reaction shows high fidelity for 1,4-CN translocation, frequently contrary to inherent site selectivity in conventional C-H functionalizations. We also report the direct transannular CN translocation of cyclic systems, providing access to valuable structures that are non-trivial to obtain by other methods. Making use of the synthetic versatility of CN and a key CN translocation step, we showcase concise syntheses of building blocks of bioactive molecules. Furthermore, the combination of C-H cyanation and CN translocation allows access to unconventional C-H derivatives. Overall, the reported reaction represents a way to achieve site-selective C-H transformation reactions without requiring a site-selective C-H cleavage step.

7.

A spatially resolved single-cell genomic atlas of the adult human breast.

Kumar, Tapsi; Nee, Kevin; Wei, Runmin; He, Siyuan; Nguyen, Quy H; Bai, Shanshan; Blake, Kerrigan; Pein, Maren; Gong, Yanwen; Sei, Emi; Hu, Min; Casasent, Anna K; Thennavan, Aatish; Li, Jianzhuo; Tran, Tuan; Chen, Ken; Nilges, Benedikt; Kashikar, Nachiket; Braubach, Oliver; Ben Cheikh, Bassem; Nikulina, Nadya; Chen, Hui; Teshome, Mediget; Menegaz, Brian; Javaid, Huma; Nagi, Chandandeep; Montalvan, Jessica; Lev, Tatyana; Mallya, Sharmila; Tifrea, Delia F; Edwards, Robert; Lin, Erin; Parajuli, Ritesh; Hanson, Summer; Winocour, Sebastian; Thompson, Alastair; Lim, Bora; Lawson, Devon A; Kessenbrock, Kai; Navin, Nicholas.

Nature ; 620(7972): 181-191, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37380767

ABSTRACT

The adult human breast is comprised of an intricate network of epithelial ducts and lobules that are embedded in connective and adipose tissue1-3. Although most previous studies have focused on the breast epithelial system4-6, many of the non-epithelial cell types remain understudied. Here we constructed the comprehensive Human Breast Cell Atlas (HBCA) at single-cell and spatial resolution. Our single-cell transcriptomics study profiled 714,331 cells from 126 women, and 117,346 nuclei from 20 women, identifying 12 major cell types and 58 biological cell states. These data reveal abundant perivascular, endothelial and immune cell populations, and highly diverse luminal epithelial cell states. Spatial mapping using four different technologies revealed an unexpectedly rich ecosystem of tissue-resident immune cells, as well as distinct molecular differences between ductal and lobular regions. Collectively, these data provide a reference of the adult normal breast tissue for studying mammary biology and diseases such as breast cancer.

Subject(s)

Breast , Gene Expression Profiling , Single-Cell Analysis , Adult , Female , Humans , Breast/cytology , Breast/immunology , Breast/metabolism , Breast Neoplasms/metabolism , Breast Neoplasms/pathology , Endothelial Cells/classification , Endothelial Cells/metabolism , Epithelial Cells/classification , Epithelial Cells/metabolism , Genomics , Immunity

8.

Comprehensive Characterization of Cancer Driver Genes and Mutations.

Bailey, Matthew H; Tokheim, Collin; Porta-Pardo, Eduard; Sengupta, Sohini; Bertrand, Denis; Weerasinghe, Amila; Colaprico, Antonio; Wendl, Michael C; Kim, Jaegil; Reardon, Brendan; Kwok-Shing Ng, Patrick; Jeong, Kang Jin; Cao, Song; Wang, Zixing; Gao, Jianjiong; Gao, Qingsong; Wang, Fang; Liu, Eric Minwei; Mularoni, Loris; Rubio-Perez, Carlota; Nagarajan, Niranjan; Cortés-Ciriano, Isidro; Zhou, Daniel Cui; Liang, Wen-Wei; Hess, Julian M; Yellapantula, Venkata D; Tamborero, David; Gonzalez-Perez, Abel; Suphavilai, Chayaporn; Ko, Jia Yu; Khurana, Ekta; Park, Peter J; Van Allen, Eliezer M; Liang, Han; Lawrence, Michael S; Godzik, Adam; Lopez-Bigas, Nuria; Stuart, Josh; Wheeler, David; Getz, Gad; Chen, Ken; Lazar, Alexander J; Mills, Gordon B; Karchin, Rachel; Ding, Li.

Cell ; 174(4): 1034-1035, 2018 08 09.

Article in English | MEDLINE | ID: mdl-30096302

9.

The Immune Landscape of Cancer.

Thorsson, Vésteinn; Gibbs, David L; Brown, Scott D; Wolf, Denise; Bortone, Dante S; Ou Yang, Tai-Hsien; Porta-Pardo, Eduard; Gao, Galen F; Plaisier, Christopher L; Eddy, James A; Ziv, Elad; Culhane, Aedin C; Paull, Evan O; Sivakumar, I K Ashok; Gentles, Andrew J; Malhotra, Raunaq; Farshidfar, Farshad; Colaprico, Antonio; Parker, Joel S; Mose, Lisle E; Vo, Nam Sy; Liu, Jianfang; Liu, Yuexin; Rader, Janet; Dhankani, Varsha; Reynolds, Sheila M; Bowlby, Reanne; Califano, Andrea; Cherniack, Andrew D; Anastassiou, Dimitris; Bedognetti, Davide; Mokrab, Younes; Newman, Aaron M; Rao, Arvind; Chen, Ken; Krasnitz, Alexander; Hu, Hai; Malta, Tathiane M; Noushmehr, Houtan; Pedamallu, Chandra Sekhar; Bullman, Susan; Ojesina, Akinyemi I; Lamb, Andrew; Zhou, Wanding; Shen, Hui; Choueiri, Toni K; Weinstein, John N; Guinney, Justin; Saltz, Joel; Holt, Robert A.

Immunity ; 48(4): 812-830.e14, 2018 04 17.

Article in English | MEDLINE | ID: mdl-29628290

ABSTRACT

We performed an extensive immunogenomic analysis of more than 10,000 tumors comprising 33 diverse cancer types by utilizing data compiled by TCGA. Across cancer types, we identified six immune subtypes (wound healing, IFN-γ dominant, inflammatory, lymphocyte depleted, immunologically quiet, and TGF-β dominant) characterized by differences in macrophage or lymphocyte signatures, Th1:Th2 cell ratio, extent of intratumoral heterogeneity, aneuploidy, extent of neoantigen load, overall cell proliferation, expression of immunomodulatory genes, and prognosis. Specific driver mutations correlated with lower (CTNNB1, NRAS, or IDH1) or higher (BRAF, TP53, or CASP8) leukocyte levels across all cancers. Multiple control modalities of the intracellular and extracellular networks (transcription, microRNAs, copy number, and epigenetic processes) were involved in tumor-immune cell interactions, both across and within immune subtypes. Our immunogenomics pipeline to characterize these heterogeneous tumors and the resulting data are intended to serve as a resource for future targeted studies to further advance the field.

Subject(s)

Genomics/methods , Neoplasms , Adolescent , Adult , Aged , Aged, 80 and over , Child , Female , Humans , Interferon-gamma/genetics , Interferon-gamma/immunology , Macrophages/immunology , Male , Middle Aged , Neoplasms/classification , Neoplasms/genetics , Neoplasms/immunology , Prognosis , Th1-Th2 Balance/physiology , Transforming Growth Factor beta/genetics , Transforming Growth Factor beta/immunology , Wound Healing/genetics , Wound Healing/immunology , Young Adult

10.

Genomic landscape of non-small cell lung cancer in smokers and never-smokers.

Govindan, Ramaswamy; Ding, Li; Griffith, Malachi; Subramanian, Janakiraman; Dees, Nathan D; Kanchi, Krishna L; Maher, Christopher A; Fulton, Robert; Fulton, Lucinda; Wallis, John; Chen, Ken; Walker, Jason; McDonald, Sandra; Bose, Ron; Ornitz, David; Xiong, Donghai; You, Ming; Dooling, David J; Watson, Mark; Mardis, Elaine R; Wilson, Richard K.

Cell ; 150(6): 1121-34, 2012 Sep 14.

Article in English | MEDLINE | ID: mdl-22980976

ABSTRACT

We report the results of whole-genome and transcriptome sequencing of tumor and adjacent normal tissue samples from 17 patients with non-small cell lung carcinoma (NSCLC). We identified 3,726 point mutations and more than 90 indels in the coding sequence, with an average mutation frequency more than 10-fold higher in smokers than in never-smokers. Novel alterations in genes involved in chromatin modification and DNA repair pathways were identified, along with DACH1, CFTR, RELN, ABCB5, and HGF. Deep digital sequencing revealed diverse clonality patterns in both never-smokers and smokers. All validated EGFR and KRAS mutations were present in the founder clones, suggesting possible roles in cancer initiation. Analysis revealed 14 fusions, including ROS1 and ALK, as well as novel metabolic enzymes. Cell-cycle and JAK-STAT pathways are significantly altered in lung cancer, along with perturbations in 54 genes that are potentially targetable with currently available drugs.

Subject(s)

Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/pathology , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Smoking/genetics , Smoking/pathology , Carcinoma, Non-Small-Cell Lung/therapy , Chromosome Aberrations , Female , Gene Expression Profiling , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Lung Neoplasms/therapy , Male , Molecular Targeted Therapy , Point Mutation , Reelin Protein

11.

The origin and evolution of mutations in acute myeloid leukemia.

Welch, John S; Ley, Timothy J; Link, Daniel C; Miller, Christopher A; Larson, David E; Koboldt, Daniel C; Wartman, Lukas D; Lamprecht, Tamara L; Liu, Fulu; Xia, Jun; Kandoth, Cyriac; Fulton, Robert S; McLellan, Michael D; Dooling, David J; Wallis, John W; Chen, Ken; Harris, Christopher C; Schmidt, Heather K; Kalicki-Veizer, Joelle M; Lu, Charles; Zhang, Qunyuan; Lin, Ling; O'Laughlin, Michelle D; McMichael, Joshua F; Delehaunty, Kim D; Fulton, Lucinda A; Magrini, Vincent J; McGrath, Sean D; Demeter, Ryan T; Vickery, Tammi L; Hundal, Jasreet; Cook, Lisa L; Swift, Gary W; Reed, Jerry P; Alldredge, Patricia A; Wylie, Todd N; Walker, Jason R; Watson, Mark A; Heath, Sharon E; Shannon, William D; Varghese, Nobish; Nagarajan, Rakesh; Payton, Jacqueline E; Baty, Jack D; Kulkarni, Shashikant; Klco, Jeffery M; Tomasson, Michael H; Westervelt, Peter; Walter, Matthew J; Graubert, Timothy A.

Cell ; 150(2): 264-78, 2012 Jul 20.

Article in English | MEDLINE | ID: mdl-22817890

ABSTRACT

Most mutations in cancer genomes are thought to be acquired after the initiating event, which may cause genomic instability and drive clonal evolution. However, for acute myeloid leukemia (AML), normal karyotypes are common, and genomic instability is unusual. To better understand clonal evolution in AML, we sequenced the genomes of M3-AML samples with a known initiating event (PML-RARA) versus the genomes of normal karyotype M1-AML samples and the exomes of hematopoietic stem/progenitor cells (HSPCs) from healthy people. Collectively, the data suggest that most of the mutations found in AML genomes are actually random events that occurred in HSPCs before they acquired the initiating mutation; the mutational history of that cell is "captured" as the clone expands. In many cases, only one or two additional, cooperating mutations are needed to generate the malignant founding clone. Cells from the founding clone can acquire additional cooperating mutations, yielding subclones that can contribute to disease progression and/or relapse.

Subject(s)

Clonal Evolution , Leukemia, Myeloid, Acute/genetics , Mutation , Adult , Aged , DNA Mutational Analysis , Disease Progression , Female , Genome-Wide Association Study , Hematopoietic Stem Cells/metabolism , Humans , Leukemia, Myeloid, Acute/physiopathology , Male , Middle Aged , Oncogene Proteins, Fusion/genetics , Recurrence , Skin/metabolism , Young Adult

12.

Self-supervised learning on millions of primary RNA sequences from 72 vertebrates improves sequence-based RNA splicing prediction.

Chen, Ken; Zhou, Yue; Ding, Maolin; Wang, Yu; Ren, Zhixiang; Yang, Yuedong.

Brief Bioinform ; 25(3)2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38605640

ABSTRACT

Language models pretrained by self-supervised learning (SSL) have been widely utilized to study protein sequences, while few models were developed for genomic sequences and were limited to single species. Due to the lack of genomes from different species, these models cannot effectively leverage evolutionary information. In this study, we have developed SpliceBERT, a language model pretrained on primary ribonucleic acids (RNA) sequences from 72 vertebrates by masked language modeling, and applied it to sequence-based modeling of RNA splicing. Pretraining SpliceBERT on diverse species enables effective identification of evolutionarily conserved elements. Meanwhile, the learned hidden states and attention weights can characterize the biological properties of splice sites. As a result, SpliceBERT was shown effective on several downstream tasks: zero-shot prediction of variant effects on splicing, prediction of branchpoints in humans, and cross-species prediction of splice sites. Our study highlighted the importance of pretraining genomic language models on a diverse range of species and suggested that SSL is a promising approach to enhance our understanding of the regulatory logic underlying genomic sequences.

Subject(s)

RNA Splicing , Vertebrates , Animals , Humans , Base Sequence , Vertebrates/genetics , RNA , Supervised Machine Learning

13.

Fast and accurate protein intrinsic disorder prediction by using a pretrained language model.

Song, Yidong; Yuan, Qianmu; Chen, Sheng; Chen, Ken; Zhou, Yaoqi; Yang, Yuedong.

Brief Bioinform ; 24(4)2023 07 20.

Article in English | MEDLINE | ID: mdl-37204193

ABSTRACT

Determining intrinsically disordered regions of proteins is essential for elucidating protein biological functions and the mechanisms of their associated diseases. As the gap between the number of experimentally determined protein structures and the number of protein sequences continues to grow exponentially, there is a need for developing an accurate and computationally efficient disorder predictor. However, current single-sequence-based methods are of low accuracy, while evolutionary profile-based methods are computationally intensive. Here, we proposed a fast and accurate protein disorder predictor, LMDisorder, that employed embeddings generated by unsupervised pretrained language models as features. We showed that LMDisorder performs best among all single-sequence-based methods and is comparable to or better than another language-model-based technique on four independent test sets. Furthermore, LMDisorder showed equivalent or even better performance than the state-of-the-art profile-based technique SPOT-Disorder2. In addition, the high computational efficiency of LMDisorder enabled proteome-scale analysis of human proteins, showing that proteins with high predicted disorder content were associated with specific biological functions. The datasets, the source code, and the trained model are available at https://github.com/biomed-AI/LMDisorder.

Subject(s)

Proteome , Software , Humans , Amino Acid Sequence , Biological Evolution

14.

Prioritizing genomic variants pathogenicity via DNA, RNA, and protein-level features based on extreme gradient boosting.

Ding, Maolin; Chen, Ken; Yang, Yuedong; Zhao, Huiying.

Hum Genet ; 2024 Apr 04.

Article in English | MEDLINE | ID: mdl-38575818

ABSTRACT

Genetic diseases are mostly associated with genetic variants, including missense, synonymous, nonsense, and copy number variants. Previous studies indicate that these different kinds of variants affect phenotypes in various ways. It remains essential but challenging to understand the functional consequences of these genetic variants, especially the noncoding ones, due to the lack of corresponding annotations. Although many computational methods have been proposed to identify risk variants, most of them have only curated DNA-level and protein-level annotations to predict the pathogenicity of the variants, and others have been restricted to missense variants exclusively. In this study, we have curated DNA-, RNA-, and protein-level features to discriminate disease-causing variants in both coding and noncoding regions. The features of protein sequences and protein structures have been shown to be essential for analyzing missense variants in coding regions, while the features related to RNA splicing and RBP binding are significant for variants in noncoding regions and synonymous variants in coding regions. Through the integration of these features, we have formulated the Multi-level feature Genomic Variants Predictor (ML-GVP) using the gradient boosting tree. The method has been trained on more than 400,000 variants in the Sherloc-training set from the 6th critical assessment of genome interpretation with superior performance. The method is one of the two best-performing predictors on the blind test in the Sherloc assessment, and is further confirmed by another independent test dataset of de novo variants.

15.

VCAT: an integrated variant function annotation tools.

Huang, Bi; Fan, Cong; Chen, Ken; Rao, Jiahua; Ou, Peihua; Tian, Chong; Yang, Yuedong; Cooper, David N; Zhao, Huiying.

Hum Genet ; 2024 Aug 27.

Article in English | MEDLINE | ID: mdl-39192052

ABSTRACT

The development of sequencing technology has promoted discovery of variants in the human genome. Identifying functions of these variants is important for us to link genotype to phenotype, and to diagnose diseases. However, it usually requires researchers to visit multiple databases. Here, we presented a one-stop webserver for variant function annotation tools (VCAT, https://biomed.nscc-gz.cn/zhaolab/VCAT/) that is the first one connecting variants to functions via the epigenome, protein, drug and RNA. VCAT is also the first one to make all annotations visualized in interactive charts or molecular structures. VCAT allows users to upload data in VCF format, and download results via a URL. Moreover, VCAT has annotated a huge number (1,262,041,068) of variants collected from dbSNP, 1000 Genomes projects, gnomAD, ICGC, TCGA, and HPRC Pangenome project. For these variants, users are able to search their functions, related diseases and drugs from VCAT. In summary, VCAT provides a one-stop webserver to explore the potential functions of human genomic variants including their relationship with diseases and drugs.

16.

Multifunctional Wearable Electronic Based on Fabric Modified by PPy/NiCoAl-LDH for Energy Storage, Electromagnetic Interference Shielding, and Photothermal Conversion.

Lyu, Bin; Chen, Ken; Zhu, Jiamin; Gao, Dangge.

Small ; : e2402510, 2024 Jul 10.

Article in English | MEDLINE | ID: mdl-38984762

ABSTRACT

With the rapid advancement of electronic technology, traditional textiles are challenged to keep up with the demands of wearable electronics. Multifunctional textile-based electronics incorporating energy storage, electromagnetic interference (EMI) shielding, and photothermal conversion are expected to alleviate this problem. Herein, a multifunctional cotton fabric with hierarchical array structure (PPy/NiCoAl-LDH/Cotton) is fabricated by the introduction of NiCoAl-layered double hydroxide (NiCoAl-LDH) nanosheet arrays on cotton fibers, followed by polymerization and growth of continuous dense polypyrrole (PPy) conductive layers. The multifunctional cotton fabric shows a high specific areal capacitance of 754.72 mF cm-2 at 5 mA cm-2 and maintains a long cycling life (80.95% retention after 1000 cycles). The symmetrical supercapacitor assembled with this fabric achieves an energy density of 20.83 Wh cm-2 and a power density of 0.23 mW cm-2. Moreover, the multifunctional cotton fabric also possesses excellent electromagnetic interference shielding (38.83 dB), photothermal conversion (70.2 °C at 1000 mW cm-2), flexibility, and durability. Such a multifunctional cotton fabric has great potential for use in new energy, smart electronics, and thermal management applications.

17.

Capturing large genomic contexts for accurately predicting enhancer-promoter interactions.

Chen, Ken; Zhao, Huiying; Yang, Yuedong.

Brief Bioinform ; 23(2)2022 03 10.

Article in English | MEDLINE | ID: mdl-35062021

ABSTRACT

Enhancer-promoter interaction (EPI) is a key mechanism underlying gene regulation. EPI prediction has always been a challenging task because enhancers could regulate promoters of distant target genes. Although many machine learning models have been developed, they leverage only the features in enhancers and promoters, or simply add the average genomic signals in the regions between enhancers and promoters, without utilizing detailed features between or outside enhancers and promoters. Due to a lack of large-scale features, existing methods could achieve only moderate performance, especially for predicting EPIs in different cell types. Here, we present a Transformer-based model, TransEPI, for EPI prediction by capturing large genomic contexts. TransEPI was developed based on EPI datasets derived from Hi-C or ChIA-PET data in six cell lines. To avoid over-fitting, we evaluated the TransEPI model by testing it on independent test datasets where the cell line and chromosome are different from the training data. TransEPI not only achieved consistent performance across the cross-validation and test datasets from different cell types but also outperformed the state-of-the-art machine learning and deep learning models. In addition, we found that the improved performance of TransEPI was attributed to the integration of large genomic contexts. Lastly, TransEPI was extended to study the non-coding mutations associated with brain disorders or neural diseases, and we found that TransEPI was also useful for predicting the target genes of non-coding mutations.

Subject(s)

Enhancer Elements, Genetic , Genomics , Cell Line , Genomics/methods , Machine Learning , Promoter Regions, Genetic

18.

Blinatumomab maintenance after allogeneic hematopoietic cell transplantation for B-lineage acute lymphoblastic leukemia.

Gaballa, Mahmoud R; Banerjee, Pinaki; Milton, Denái R; Jiang, Xianli; Ganesh, Christina; Khazal, Sajad; Nandivada, Vandana; Islam, Sanjida; Kaplan, Mecit; Daher, May; Basar, Rafet; Alousi, Amin; Mehta, Rohtesh; Alatrash, Gheath; Khouri, Issa; Oran, Betul; Marin, David; Popat, Uday; Olson, Amanda; Tewari, Priti; Jain, Nitin; Jabbour, Elias; Ravandi, Farhad; Kantarjian, Hagop; Chen, Ken; Champlin, Richard; Shpall, Elizabeth; Rezvani, Katayoun; Kebriaei, Partow.

Blood ; 139(12): 1908-1919, 2022 03 24.

Article in English | MEDLINE | ID: mdl-34914826

ABSTRACT

Patients with B-lineage acute lymphoblastic leukemia (ALL) are at high risk for relapse after allogeneic hematopoietic cell transplantation (HCT). We conducted a single-center phase 2 study evaluating the feasibility of 4 cycles of blinatumomab administered every 3 months during the first year after HCT in an effort to mitigate relapse in high-risk ALL patients. Twenty-one of 23 enrolled patients received at least 1 cycle of blinatumomab and were included in the analysis. The median time from HCT to the first cycle of blinatumomab was 78 days (range, 44 to 105). Twelve patients (57%) completed all 4 treatment cycles. Neutropenia was the only grade 4 adverse event (19%). Rates of cytokine release (5% G1) and neurotoxicity (5% G2) were minimal. The cumulative incidences of acute graft-versus-host disease (GVHD) grades 2 to 4 and 3 to 4 were 33% and 5%, respectively; 2 cases of mild (10%) and 1 case of moderate (5%) chronic GVHD were noted. With a median follow-up of 14.3 months, the 1-year overall survival (OS), progression-free survival (PFS), and nonrelapse mortality (NRM) rates were 85%, 71%, and 0%, respectively. In a matched analysis with a contemporary cohort of 57 patients, we found no significant difference between groups regarding blinatumomab's efficacy. Correlative studies of baseline and posttreatment samples identified patients with specific T-cell profiles as "responders" or "nonresponders" to therapy. Responders had higher proportions of effector memory CD8 T-cell subsets. Nonresponders were T-cell deficient and expressed more inhibitory checkpoint molecules, including T-cell immunoglobulin and mucin domain 3 (TIM3). We found that blinatumomab postallogeneic HCT is feasible, and its benefit is dependent on the immune milieu at time of treatment. This trial is registered at ClinicalTrials.gov, study ID: NCT02807883.

Subject(s)

Graft vs Host Disease , Hematopoietic Stem Cell Transplantation , Lymphoma, B-Cell , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Acute Disease , Antibodies, Bispecific , Hematopoietic Stem Cell Transplantation/adverse effects , Humans , Precursor Cell Lymphoblastic Leukemia-Lymphoma/therapy , Recurrence

19.

Irisin attenuates type 1 diabetic cardiomyopathy by anti-ferroptosis via SIRT1-mediated deacetylation of p53.

Tang, Yuan-Juan; Zhang, Zhen; Yan, Tong; Chen, Ken; Xu, Guo-Fan; Xiong, Shi-Qiang; Wu, Dai-Qian; Chen, Jie; Jose, Pedro A; Zeng, Chun-Yu; Fu, Jin-Juan.

Cardiovasc Diabetol ; 23(1): 116, 2024 Apr 02.

Article in English | MEDLINE | ID: mdl-38566123

ABSTRACT

BACKGROUND: Diabetic cardiomyopathy (DCM) is a serious complication in patients with type 1 diabetes mellitus (T1DM), which still lacks adequate therapy. Irisin, a peptide cleaved from fibronectin type III domain-containing 5, has been shown to preserve cardiac function in cardiac ischemia-reperfusion injury. Whether or not irisin plays a cardioprotective role in DCM is not known. METHODS AND RESULTS: T1DM was induced by multiple low-dose intraperitoneal injections of streptozotocin (STZ). Our current study showed that irisin expression/level was lower in the heart and serum of mice with STZ-induced T1DM. Irisin supplementation by intraperitoneal injection improved the impaired cardiac function in mice with DCM, which was ascribed to the inhibition of ferroptosis, because the increased ferroptosis, associated with increased cardiac malondialdehyde (MDA), decreased reduced glutathione (GSH) and protein expressions of solute carrier family 7 member 11 (SLC7A11) and glutathione peroxidase 4 (GPX4), was ameliorated by irisin. In the presence of erastin, a ferroptosis inducer, the irisin-mediated protective effects were blocked. Mechanistically, irisin treatment increased Sirtuin 1 (SIRT1) and decreased p53 K382 acetylation, which decreased p53 protein expression by increasing its degradation and consequently upregulated SLC7A11 and GPX4 expressions. Thus, irisin-mediated reduction in p53 decreases ferroptosis and protects cardiomyocytes against injury due to high glucose. CONCLUSION: This study demonstrated that irisin could improve cardiac function by suppressing ferroptosis in T1DM via the SIRT1-p53-SLC7A11/GPX4 pathway. Irisin may be a therapeutic approach in the management of T1DM-induced cardiomyopathy.

Subject(s)

Diabetes Mellitus, Type 1 , Diabetic Cardiomyopathies , Ferroptosis , Humans , Animals , Mice , Diabetic Cardiomyopathies/drug therapy , Diabetic Cardiomyopathies/etiology , Diabetic Cardiomyopathies/prevention & control , Sirtuin 1 , Fibronectins , Diabetes Mellitus, Type 1/complications , Diabetes Mellitus, Type 1/drug therapy , Tumor Suppressor Protein p53 , Myocytes, Cardiac

20.

Forecasting acute kidney injury and resource utilization in ICU patients using longitudinal, multimodal models.

Tan, Yukun; Dede, Merve; Mohanty, Vakul; Dou, Jinzhuang; Hill, Holly; Bernstam, Elmer; Chen, Ken.

J Biomed Inform ; 154: 104648, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38692464

ABSTRACT

BACKGROUND: Advances in artificial intelligence (AI) have realized the potential of revolutionizing healthcare, such as predicting disease progression via longitudinal inspection of Electronic Health Records (EHRs) and lab tests from patients admitted to Intensive Care Units (ICU). Although substantial literature exists addressing broad subjects, including the prediction of mortality, length-of-stay, and readmission, studies focusing on forecasting Acute Kidney Injury (AKI), specifically dialysis anticipation like Continuous Renal Replacement Therapy (CRRT) are scarce. The technicality of how to implement AI remains elusive. OBJECTIVE: This study aims to elucidate the important factors and methods that are required to develop effective predictive models of AKI and CRRT for patients admitted to ICU, using EHRs in the Medical Information Mart for Intensive Care (MIMIC) database. METHODS: We conducted a comprehensive comparative analysis of established predictive models, considering both time-series measurements and clinical notes from MIMIC-IV databases. Subsequently, we proposed a novel multi-modal model which integrates embeddings of top-performing unimodal models, including Long Short-Term Memory (LSTM) and BioMedBERT, and leverages both unstructured clinical notes and structured time series measurements derived from EHRs to enable the early prediction of AKI and CRRT. RESULTS: Our multimodal model achieved a lead time of at least 12 h ahead of clinical manifestation, with an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.888 for AKI and 0.997 for CRRT, as well as an Area Under the Precision Recall Curve (AUPRC) of 0.727 for AKI and 0.840 for CRRT, respectively, which significantly outperformed the baseline models. Additionally, we performed a SHapley Additive exPlanation (SHAP) analysis using the expected gradients algorithm, which highlighted important, previously underappreciated predictive features for AKI and CRRT. CONCLUSION: Our study revealed the importance and the technicality of applying longitudinal, multimodal modeling to improve early prediction of AKI and CRRT, offering insights for timely interventions. The performance and interpretability of our model indicate its potential for further assessment towards clinical applications, to ultimately optimize AKI management and enhance patient outcomes.

Subject(s)

Acute Kidney Injury , Electronic Health Records , Intensive Care Units , Acute Kidney Injury/therapy , Humans , Longitudinal Studies , Renal Replacement Therapy , Artificial Intelligence , Forecasting , Length of Stay , Male , Databases, Factual , Female
