Search | BVS Violência e Saúde

A genome-wide spectrum of tandem repeat expansions in 338,963 humans.

Cui, Ya; Ye, Wenbin; Li, Jason Sheng; Li, Jingyi Jessica; Vilain, Eric; Sallam, Tamer; Li, Wei.

Cell ; 187(9): 2336-2341.e5, 2024 Apr 25.

Article in English | MEDLINE | ID: mdl-38582080

ABSTRACT

The Genome Aggregation Database (gnomAD), widely recognized as the gold-standard reference map of human genetic variation, has largely overlooked tandem repeat (TR) expansions, despite the fact that TRs constitute ~6% of our genome and are linked to over 50 human diseases. Here, we introduce TR-gnomAD (https://wlcb.oit.uci.edu/TRgnomAD), a biobank-scale reference of 0.86 million TRs derived from 338,963 whole-genome sequencing (WGS) samples of diverse ancestries (39.5% non-European samples). TR-gnomAD offers critical insights into ancestry-specific disease prevalence using disparities in TR unit number frequencies among ancestries. Moreover, TR-gnomAD can differentiate common, presumably benign TR expansions, which are prevalent in TR-gnomAD, from potentially pathogenic TR expansions, which are found more frequently in disease groups than within TR-gnomAD. Together, TR-gnomAD is an invaluable resource for researchers and physicians to interpret TR expansions in individuals with genetic diseases.

Subjects

Human Genome, Tandem Repeat Sequences, Humans, Tandem Repeat Sequences/genetics, Whole Genome Sequencing, Genetic Databases, DNA Repeat Expansion/genetics, Genome-Wide Association Study
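
The abstract above describes separating common expansions from those seen more often in disease groups than in the reference. One generic way to make such a comparison is a one-sided Fisher's exact test on carrier counts. The sketch below is illustrative only: it is not stated to be the authors' procedure, the function name `fisher_one_sided` is hypothetical, and the counts in the usage line are invented placeholders rather than values from TR-gnomAD.

```python
from math import comb

def fisher_one_sided(case_exp, case_total, ref_exp, ref_total):
    """One-sided Fisher's exact p-value for enrichment of expansion
    carriers in a disease group relative to a reference panel.

    Returns P(X >= case_exp) under the hypergeometric null, where X is
    the number of expansion carriers that fall in the disease group.
    """
    total = case_total + ref_total      # all individuals
    expanded = case_exp + ref_exp       # all expansion carriers
    denom = comb(total, case_total)     # ways to choose the disease group
    upper = min(case_total, expanded)   # largest possible carrier count in cases
    numer = sum(
        comb(expanded, x) * comb(total - expanded, case_total - x)
        for x in range(case_exp, upper + 1)
    )
    return numer / denom

# Hypothetical counts (assumptions for illustration, not real data):
# 12 of 200 patients vs. 150 of 338,963 reference samples carry an expansion.
p_value = fisher_one_sided(12, 200, 150, 338_963)
```

A small p-value would suggest the expansion is enriched in the disease group; what counts as an "expansion" (the repeat-unit threshold) has to be fixed beforehand and is left out of this sketch.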

Statistical method scDEED for detecting dubious 2D single-cell embeddings and optimizing t-SNE and UMAP hyperparameters.

Xia, Lucy; Lee, Christy; Li, Jingyi Jessica.

Nat Commun ; 15(1): 1753, 2024 Feb 26.

Article in English | MEDLINE | ID: mdl-38409103

ABSTRACT

Two-dimensional (2D) embedding methods are crucial for single-cell data visualization. Popular methods such as t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP) are commonly used for visualizing cell clusters; however, it is well known that t-SNE and UMAP's 2D embeddings might not reliably inform the similarities among cell clusters. Motivated by this challenge, we present a statistical method, scDEED, for detecting dubious cell embeddings output by a 2D-embedding method. By calculating a reliability score for every cell embedding based on the similarity between the cell's 2D-embedding neighbors and pre-embedding neighbors, scDEED identifies the cell embeddings with low reliability scores as dubious and those with high reliability scores as trustworthy. Moreover, by minimizing the number of dubious cell embeddings, scDEED provides intuitive guidance for optimizing the hyperparameters of an embedding method. We show the effectiveness of scDEED on multiple datasets for detecting dubious cell embeddings and optimizing the hyperparameters of t-SNE and UMAP.

Subjects

Algorithms, Reproducibility of Results
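
The idea in the abstract above, scoring each cell by how well its 2D-embedding neighbors agree with its pre-embedding neighbors, can be illustrated with a much simpler proxy. The sketch below uses the Jaccard overlap of k-nearest-neighbor sets. This is an assumption made for illustration and not the published scDEED score, and the function names, toy coordinates, and threshold are all hypothetical.

```python
import math

def k_nearest(points, i, k):
    """Indices of the k points closest to points[i] (Euclidean), excluding i."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return {j for _, j in dists[:k]}

def neighbor_overlap_scores(high_dim, embedded, k):
    """Per-cell Jaccard overlap between pre-embedding and 2D k-NN sets."""
    scores = []
    for i in range(len(high_dim)):
        before = k_nearest(high_dim, i, k)   # neighbors before embedding
        after = k_nearest(embedded, i, k)    # neighbors in the 2D embedding
        scores.append(len(before & after) / len(before | after))
    return scores

def flag_dubious(scores, threshold):
    """Indices of cells whose overlap score falls below the threshold."""
    return [i for i, s in enumerate(scores) if s < threshold]

# Toy data (hypothetical): two well-separated groups of three "cells".
high_dim = [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0), (0.0, 0.3, 0.0),
            (5.0, 5.0, 5.0), (5.2, 5.0, 5.0), (5.0, 5.3, 5.0)]
faithful = [(x, y) for x, y, _ in high_dim]   # keeps neighborhoods intact
distorted = list(faithful)
distorted[2] = (5.0, 4.0)                     # cell 2 lands next to the wrong group

dubious = flag_dubious(neighbor_overlap_scores(high_dim, distorted, k=2), 0.5)
```

In the same spirit as the hyperparameter guidance mentioned in the abstract, one could rerun an embedding under several settings and prefer the one with the fewest flagged cells; the choice of `k` and of the threshold here is arbitrary.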

Categorization of 31 computational methods to detect spatially variable genes from spatially resolved transcriptomics data.

Yan, Guanao; Hua, Shuo Harper; Li, Jingyi Jessica.

ArXiv ; 2024 Jul 08.

Article in English | MEDLINE | ID: mdl-38855546

ABSTRACT

In the analysis of spatially resolved transcriptomics data, detecting spatially variable genes (SVGs) is crucial. Numerous computational methods exist, but varying SVG definitions and methodologies lead to incomparable results. We review 31 state-of-the-art methods, categorizing SVGs into three types: overall, cell-type-specific, and spatial-domain-marker SVGs. Our review explains the intuitions underlying these methods, summarizes their applications, and categorizes the hypothesis tests they use in the trade-off between generality and specificity for SVG detection. We discuss challenges in SVG detection and propose future directions for improvement. Our review offers insights for method developers and users, advocating for category-specific benchmarking.

scCDC: a computational method for gene-specific contamination detection and correction in single-cell and single-nucleus RNA-seq data.

Wang, Weijian; Cen, Yihui; Lu, Zezhen; Xu, Yueqing; Sun, Tianyi; Xiao, Ying; Liu, Wanlu; Li, Jingyi Jessica; Wang, Chaochen.

Genome Biol ; 25(1): 136, 2024 May 23.

Article in English | MEDLINE | ID: mdl-38783325

ABSTRACT

In droplet-based single-cell and single-nucleus RNA-seq assays, systematic contamination by ambient RNA molecules biases the quantification of gene expression levels. Existing methods correct the contamination for all genes globally. However, a specific evaluation of correction efficacy across varying contamination levels is lacking. Here, we show that DecontX and CellBender under-correct highly contaminating genes, while SoupX and scAR over-correct lowly/non-contaminating genes. We develop scCDC as the first method to detect the contamination-causing genes and only correct expression levels of these genes, some of which are cell-type markers. Compared with existing decontamination methods, scCDC excels in decontaminating highly contaminating genes while avoiding over-correction of other genes.

Subjects

RNA-Seq, Single-Cell Analysis, Single-Cell Analysis/methods, RNA-Seq/methods, Humans, Computational Biology/methods, RNA Sequence Analysis/methods, Cell Nucleus/genetics, Software, Animals

Targeting circadian transcriptional programs in triple negative breast cancer through a cis-regulatory mechanism.

Pan, Yuanzhong; Chiu, Tsu-Pei; Zhou, Lili; Chan, Priscilla; Kuo, Tia Tyrsett; Battaglin, Francesca; Soni, Shivani; Jayachandran, Priya; Li, Jingyi Jessica; Lenz, Heinz-Josef; Mumenthaler, Shannon M; Rohs, Remo; Torres, Evanthia Roussos; Kay, Steve A.

bioRxiv ; 2024 May 15.

Article in English | MEDLINE | ID: mdl-38746115

ABSTRACT

Circadian clock genes are emerging targets in many types of cancer, but their mechanistic contributions to tumor progression are still largely unknown. This makes it challenging to stratify patient populations and develop corresponding treatments. In this work, we show that in breast cancer, the disrupted expression of circadian genes has the potential to serve as a biomarker. We also show that the master circadian transcription factors (TFs) BMAL1 and CLOCK are required for the proliferation of metastatic mesenchymal stem-like (mMSL) triple-negative breast cancer (TNBC) cells. Using currently available small-molecule modulators, we found that a stabilizer of cryptochrome 2 (CRY2), the direct repressor of BMAL1 and CLOCK transcriptional activity, synergizes with inhibitors of the proteasome, which is required for BMAL1 and CLOCK function, to repress a transcriptional program comprising circadian cycling genes in mMSL TNBC cells. Omics analyses of drug-treated cells implied that this repression of transcription is mediated by transcription factor binding site (TFBS) features in the cis-regulatory elements (CREs) of clock-controlled genes. Through a massively parallel reporter assay, we defined a set of CRE features that are potentially repressed by the specific drug combination. The identification of cis-element enrichment may serve as a new way of defining and targeting tumor types through the modulation of cis-regulatory programs, and ultimately provide a new paradigm of therapy design for cancer types with unclear drivers like TNBC.

Developmental isoform diversity in the human neocortex informs neuropsychiatric risk mechanisms.

Patowary, Ashok; Zhang, Pan; Jops, Connor; Vuong, Celine K; Ge, Xinzhou; Hou, Kangcheng; Kim, Minsoo; Gong, Naihua; Margolis, Michael; Vo, Daniel; Wang, Xusheng; Liu, Chunyu; Pasaniuc, Bogdan; Li, Jingyi Jessica; Gandal, Michael J; de la Torre-Ubieta, Luis.

Science ; 384(6698): eadh7688, 2024 May 24.

Article in English | MEDLINE | ID: mdl-38781356

ABSTRACT

RNA splicing is highly prevalent in the brain and has strong links to neuropsychiatric disorders; yet, the role of cell type-specific splicing and transcript-isoform diversity during human brain development has not been systematically investigated. In this work, we leveraged single-molecule long-read sequencing to deeply profile the full-length transcriptome of the germinal zone and cortical plate regions of the developing human neocortex at tissue and single-cell resolution. We identified 214,516 distinct isoforms, of which 72.6% were novel (not previously annotated in Gencode version 33), and uncovered a substantial contribution of transcript-isoform diversity, regulated by RNA-binding proteins, in defining cellular identity in the developing neocortex. We leveraged this comprehensive isoform-centric gene annotation to reprioritize thousands of rare de novo risk variants and elucidate genetic risk mechanisms for neuropsychiatric disorders.

Subjects

Mental Disorders, Neocortex, Neurogenesis, Protein Isoforms, RNA Splicing, Single-Cell Analysis, Transcriptome, Humans, Alternative Splicing, Genetic Predisposition to Disease, Mental Disorders/genetics, Molecular Sequence Annotation, Neocortex/metabolism, Neocortex/embryology, Protein Isoforms/genetics, Protein Isoforms/metabolism, RNA-Binding Proteins/genetics, RNA-Binding Proteins/metabolism, Neurogenesis/genetics
