Search | BVS Enfermagem Research Portal

1.

Pan-cancer proteogenomics expands the landscape of therapeutic targets.

Savage, Sara R; Yi, Xinpei; Lei, Jonathan T; Wen, Bo; Zhao, Hongwei; Liao, Yuxing; Jaehnig, Eric J; Somes, Lauren K; Shafer, Paul W; Lee, Tobie D; Fu, Zile; Dou, Yongchao; Shi, Zhiao; Gao, Daming; Hoyos, Valentina; Gao, Qiang; Zhang, Bing.

Cell ; 2024 Jun 17.

Article in English | MEDLINE | ID: mdl-38917788

ABSTRACT

Fewer than 200 proteins are targeted by cancer drugs approved by the Food and Drug Administration (FDA). We integrate Clinical Proteomic Tumor Analysis Consortium (CPTAC) proteogenomics data from 1,043 patients across 10 cancer types with additional public datasets to identify potential therapeutic targets. Pan-cancer analysis of 2,863 druggable proteins reveals a wide abundance range and identifies biological factors that affect mRNA-protein correlation. Integration of proteomic data from tumors and genetic screen data from cell lines identifies protein overexpression- or hyperactivation-driven druggable dependencies, enabling accurate predictions of effective drug targets. Proteogenomic identification of synthetic lethality provides a strategy to target tumor suppressor gene loss. Combining proteogenomic analysis and MHC binding prediction prioritizes mutant KRAS peptides as promising public neoantigens. Computational identification of shared tumor-associated antigens followed by experimental confirmation nominates peptides as immunotherapy targets. These analyses, summarized at https://targets.linkedomics.org, form a comprehensive landscape of protein and peptide targets for companion diagnostics, drug repurposing, and therapy development.

2.

Automatic cell-type harmonization and integration across Human Cell Atlas datasets.

Xu, Chuan; Prete, Martin; Webb, Simone; Jardine, Laura; Stewart, Benjamin J; Hoo, Regina; He, Peng; Meyer, Kerstin B; Teichmann, Sarah A.

Cell ; 186(26): 5876-5891.e20, 2023 Dec 21.

Article in English | MEDLINE | ID: mdl-38134877

ABSTRACT

Harmonizing cell types across the single-cell community and assembling them into a common framework is central to building a standardized Human Cell Atlas. Here, we present CellHint, a predictive clustering tree-based tool to resolve cell-type differences in annotation resolution and technical biases across datasets. CellHint accurately quantifies cell-cell transcriptomic similarities and places cell types into a relationship graph that hierarchically defines shared and unique cell subtypes. Application to multiple immune datasets recapitulates expert-curated annotations. CellHint also reveals underexplored relationships between healthy and diseased lung cell states in eight diseases. Furthermore, we present a workflow for fast cross-dataset integration guided by harmonized cell types and cell hierarchy, which uncovers underappreciated cell types in adult human hippocampus. Finally, we apply CellHint to 12 tissues from 38 datasets, providing a deeply curated cross-tissue database with ~3.7 million cells and various machine learning models for automatic cell annotation across human tissues.

Subjects

Gene Expression Profiling; Transcriptome; Humans; Databases, Factual; Single-Cell Analysis

3.

Whole-body integration of gene expression and single-cell morphology.

Vergara, Hernando M; Pape, Constantin; Meechan, Kimberly I; Zinchenko, Valentyna; Genoud, Christel; Wanner, Adrian A; Mutemi, Kevin Nzumbi; Titze, Benjamin; Templin, Rachel M; Bertucci, Paola Y; Simakov, Oleg; Dürichen, Wiebke; Machado, Pedro; Savage, Emily L; Schermelleh, Lothar; Schwab, Yannick; Friedrich, Rainer W; Kreshuk, Anna; Tischer, Christian; Arendt, Detlev.

Cell ; 184(18): 4819-4837.e22, 2021 Sep 02.

Article in English | MEDLINE | ID: mdl-34380046

ABSTRACT

Animal bodies are composed of cell types with unique expression programs that implement their distinct locations, shapes, structures, and functions. Based on these properties, cell types assemble into specific tissues and organs. To systematically explore the link between cell-type-specific gene expression and morphology, we registered an expression atlas to a whole-body electron microscopy volume of the nereid Platynereis dumerilii. Automated segmentation of cells and nuclei identifies major cell classes and establishes a link between gene activation, chromatin topography, and nuclear size. Clustering of segmented cells according to gene expression reveals spatially coherent tissues. In the brain, genetically defined groups of neurons match ganglionic nuclei with coherent projections. Besides interneurons, we uncover sensory-neurosecretory cells in the nereid mushroom bodies, which thus qualify as sensory organs. They furthermore resemble the vertebrate telencephalon by molecular anatomy. We provide an integrated browser as a Fiji plugin for remote exploration of all available multimodal datasets.

Subjects

Cell Shape; Gene Expression Regulation; Polychaeta/cytology; Polychaeta/genetics; Single-Cell Analysis; Animals; Cell Nucleus/metabolism; Ganglia, Invertebrate/metabolism; Gene Expression Profiling; Multigene Family; Multimodal Imaging; Mushroom Bodies/metabolism; Polychaeta/ultrastructure

4.

The Human Tumor Atlas Network: Charting Tumor Transitions across Space and Time at Single-Cell Resolution.

Rozenblatt-Rosen, Orit; Regev, Aviv; Oberdoerffer, Philipp; Nawy, Tal; Hupalowska, Anna; Rood, Jennifer E; Ashenberg, Orr; Cerami, Ethan; Coffey, Robert J; Demir, Emek; Ding, Li; Esplin, Edward D; Ford, James M; Goecks, Jeremy; Ghosh, Sharmistha; Gray, Joe W; Guinney, Justin; Hanlon, Sean E; Hughes, Shannon K; Hwang, E Shelley; Iacobuzio-Donahue, Christine A; Jané-Valbuena, Judit; Johnson, Bruce E; Lau, Ken S; Lively, Tracy; Mazzilli, Sarah A; Pe'er, Dana; Santagata, Sandro; Shalek, Alex K; Schapiro, Denis; Snyder, Michael P; Sorger, Peter K; Spira, Avrum E; Srivastava, Sudhir; Tan, Kai; West, Robert B; Williams, Elizabeth H.

Cell ; 181(2): 236-249, 2020 Apr 16.

Article in English | MEDLINE | ID: mdl-32302568

ABSTRACT

Crucial transitions in cancer, including tumor initiation, local expansion, metastasis, and therapeutic resistance, involve complex interactions between cells within the dynamic tumor ecosystem. Transformative single-cell genomics technologies and spatial multiplex in situ methods now provide an opportunity to interrogate this complexity at unprecedented resolution. The Human Tumor Atlas Network (HTAN), part of the National Cancer Institute (NCI) Cancer Moonshot Initiative, will establish a clinical, experimental, computational, and organizational framework to generate informative and accessible three-dimensional atlases of cancer transitions for a diverse set of tumor types. This effort complements both ongoing efforts to map healthy organs and previous large-scale cancer genomics approaches focused on bulk sequencing at a single point in time. Generating single-cell, multiparametric, longitudinal atlases and integrating them with clinical outcomes should help identify novel predictive biomarkers and features as well as therapeutically relevant cell types, cell states, and cellular interactions across transitions. The resulting tumor atlases should have a profound impact on our understanding of cancer biology and have the potential to improve cancer detection, prevention, and therapeutic discovery for better precision-medicine treatments of cancer patients and those at risk for cancer.

Subjects

Cell Transformation, Neoplastic/metabolism; Neoplasms/metabolism; Tumor Microenvironment/physiology; Atlases as Topic; Cell Transformation, Neoplastic/pathology; Genomics/methods; Humans; Precision Medicine/methods; Single-Cell Analysis/methods

5.

Single-Cell Multi-omic Integration Compares and Contrasts Features of Brain Cell Identity.

Welch, Joshua D; Kozareva, Velina; Ferreira, Ashley; Vanderburg, Charles; Martin, Carly; Macosko, Evan Z.

Cell ; 177(7): 1873-1887.e17, 2019 Jun 13.

Article in English | MEDLINE | ID: mdl-31178122

ABSTRACT

Defining cell types requires integrating diverse single-cell measurements from multiple experiments and biological contexts. To flexibly model single-cell datasets, we developed LIGER, an algorithm that delineates shared and dataset-specific features of cell identity. We applied it to four diverse and challenging analyses of human and mouse brain cells. First, we defined region-specific and sexually dimorphic gene expression in the mouse bed nucleus of the stria terminalis. Second, we analyzed expression in the human substantia nigra, comparing cell states in specific donors and relating cell types to those in the mouse. Third, we integrated in situ and single-cell expression data to spatially locate fine subtypes of cells present in the mouse frontal cortex. Finally, we jointly defined mouse cortical cell types using single-cell RNA-seq and DNA methylation profiles, revealing putative mechanisms of cell-type-specific epigenomic regulation. Integrative analyses using LIGER promise to accelerate investigations of cell-type definition, gene regulation, and disease states.

Subjects

DNA Methylation; Gene Expression Regulation; Septal Nuclei; Sequence Analysis, RNA; Single-Cell Analysis; Substantia Nigra; Adolescent; Adult; Aged; Animals; Female; Humans; Male; Mice; Middle Aged; Septal Nuclei/cytology; Septal Nuclei/metabolism; Substantia Nigra/cytology; Substantia Nigra/metabolism

6.

CellVis2: a conference on visualizing the molecular cell.

Autin, Ludovic; Goodsell, David S; Viola, Ivan; Olson, Arthur.

Trends Biochem Sci ; 49(7): 559-563, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38670884

ABSTRACT

In January 2024, a targeted conference, 'CellVis2', was held at Scripps Research in La Jolla, USA, the second in a series designed to explore the promise, practices, roadblocks, and prospects of creating, visualizing, sharing, and communicating physical representations of entire biological cells at scales down to the atom.

7.

Comparative analysis of integrative classification methods for multi-omics data.

Novoloaca, Alexei; Broc, Camilo; Beloeil, Laurent; Yu, Wen-Han; Becker, Jérémie.

Brief Bioinform ; 25(4), 2024 May 23.

Article in English | MEDLINE | ID: mdl-38985929

ABSTRACT

Recent advances in sequencing, mass spectrometry, and cytometry technologies have enabled researchers to collect multiple 'omics data types from a single sample. These large datasets have led to a growing consensus that a holistic approach is needed to identify new candidate biomarkers and unveil mechanisms underlying disease etiology, a key to precision medicine. While many reviews and benchmarks have been conducted on unsupervised approaches, their supervised counterparts have received less attention in the literature and no gold standard has emerged yet. In this work, we present a thorough comparison of a selection of six methods, representative of the main families of intermediate integrative approaches (matrix factorization, multiple kernel methods, ensemble learning, and graph-based methods). As a non-integrative control, random forest was performed on concatenated and separated data types. Methods were evaluated for classification performance on both simulated and real-world datasets, the latter being carefully selected to cover different medical applications (infectious diseases, oncology, and vaccines) and data modalities. A total of 15 simulation scenarios were designed from the real-world datasets to explore a large and realistic parameter space (e.g. sample size, dimensionality, class imbalance, effect size). On real data, the method comparison showed that integrative approaches performed as well as or better than their non-integrative counterpart. By contrast, DIABLO and the four random forest alternatives outperformed the others across the majority of simulation scenarios. The strengths and limitations of these methods are discussed in detail as well as guidelines for future applications.

Subjects

Computational Biology; Humans; Computational Biology/methods; Algorithms; Genomics/methods; Genomics/statistics & numerical data; Multiomics

8.

DeepKEGG: a multi-omics data integration framework with biological insights for cancer recurrence prediction and biomarker discovery.

Lan, Wei; Liao, Haibo; Chen, Qingfeng; Zhu, Lingzhi; Pan, Yi; Chen, Yi-Ping Phoebe.

Brief Bioinform ; 25(3), 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38678587

ABSTRACT

Deep learning-based multi-omics data integration methods have the capability to reveal the mechanisms of cancer development, discover cancer biomarkers and identify pathogenic targets. However, current methods ignore the potential correlations between samples in integrating multi-omics data. In addition, providing accurate biological explanations still poses significant challenges due to the complexity of deep learning models. Therefore, there is an urgent need for a deep learning-based multi-omics integration method to explore the potential correlations between samples and provide model interpretability. Herein, we propose a novel interpretable multi-omics data integration method (DeepKEGG) for cancer recurrence prediction and biomarker discovery. In DeepKEGG, a biological hierarchical module is designed for local connections of neuron nodes and model interpretability based on the biological relationship between genes/miRNAs and pathways. In addition, a pathway self-attention module is constructed to explore the correlation between different samples and generate the potential pathway feature representation for enhancing the prediction performance of the model. Lastly, an attribution-based feature importance calculation method is utilized to discover biomarkers related to cancer recurrence and provide a biological interpretation of the model. Experimental results demonstrate that DeepKEGG outperforms other state-of-the-art methods in 5-fold cross validation. Furthermore, case studies also indicate that DeepKEGG serves as an effective tool for biomarker discovery. The code is available at https://github.com/lanbiolab/DeepKEGG.

Subjects

Biomarkers, Tumor; Deep Learning; Neoplasm Recurrence, Local; Humans; Biomarkers, Tumor/metabolism; Biomarkers, Tumor/genetics; Neoplasm Recurrence, Local/metabolism; Neoplasm Recurrence, Local/genetics; Computational Biology/methods; Neoplasms/genetics; Neoplasms/metabolism; Neoplasms/pathology; Genomics/methods; Multiomics

9.

Deeply integrating latent consistent representations in high-noise multi-omics data for cancer subtyping.

Cai, Yueyi; Wang, Shunfang.

Brief Bioinform ; 25(2), 2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38426322

ABSTRACT

Cancer is a complex and high-mortality disease regulated by multiple factors. Accurate cancer subtyping is crucial for formulating personalized treatment plans and improving patient survival rates. The underlying mechanisms that drive cancer progression can be comprehensively understood by analyzing multi-omics data. However, the high noise levels in omics data often pose challenges in capturing consistent representations and adequately integrating their information. This paper proposed a novel variational autoencoder-based deep learning model, named Deeply Integrating Latent Consistent Representations (DILCR). Firstly, multiple independent variational autoencoders and contrastive loss functions were designed to separate noise from omics data and capture latent consistent representations. Subsequently, an Attention Deep Integration Network was proposed to integrate consistent representations across different omics levels effectively. Additionally, we introduced the Improved Deep Embedded Clustering algorithm to make the integrated variables clustering-friendly. The effectiveness of DILCR was evaluated using 10 typical cancer datasets from The Cancer Genome Atlas and compared with 14 state-of-the-art integration methods. The results demonstrated that DILCR effectively captures the consistent representations in omics data and outperforms other integration methods in cancer subtyping. In the Kidney Renal Clear Cell Carcinoma case study, DILCR identified cancer subtypes with marked biological significance and interpretability.

Subjects

Carcinoma, Renal Cell; Kidney Neoplasms; Neoplasms; Humans; Multiomics; Neoplasms/genetics; Carcinoma, Renal Cell/genetics; Algorithms; Cluster Analysis; Kidney Neoplasms/genetics

10.

From G1 to M: a comparative study of methods for identifying cell cycle phases.

Guo, Xinyu; Chen, Liang.

Brief Bioinform ; 25(2), 2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38261342

ABSTRACT

Accurate identification of cell cycle phases in single-cell RNA-sequencing (scRNA-seq) data is crucial for biomedical research. Many methods have been developed to tackle this challenge, employing diverse approaches to predict cell cycle phases. In this review article, we delve into the standard processes in identifying cell cycle phases within scRNA-seq data and present several representative methods for comparison. To rigorously assess the accuracy of these methods, we propose an error function and employ multiple benchmarking datasets encompassing human and mouse data. Our evaluation results reveal a key finding: the fit between the reference data and the dataset being analyzed profoundly impacts the effectiveness of cell cycle phase identification methods. Therefore, researchers must carefully consider the compatibility between the reference data and their dataset to achieve optimal results. Furthermore, we explore the potential benefits of incorporating benchmarking data with multiple known cell cycle phases into the analysis. Merging such data with the target dataset shows promise in enhancing prediction accuracy. By shedding light on the accuracy and performance of cell cycle phase prediction methods across diverse datasets, this review aims to motivate and guide future methodological advancements. Our findings offer valuable insights for researchers seeking to improve their understanding of cellular dynamics through scRNA-seq analysis, ultimately fostering the development of more robust and widely applicable cell cycle identification methods.

Subjects

Benchmarking; Biomedical Research; Humans; Animals; Mice; Cell Cycle; Research Personnel

11.

Predicting RNA polymerase II transcriptional elongation pausing and associated histone code.

Ren, Lixin; Ma, Wanbiao; Wang, Yong.

Brief Bioinform ; 25(4), 2024 May 23.

Article in English | MEDLINE | ID: mdl-38783706

ABSTRACT

RNA Polymerase II (Pol II) transcriptional elongation pausing is an integral part of the dynamic regulation of gene transcription in the genome of metazoans. It plays a pivotal role in many vital biological processes and disease progression. However, experimentally measuring genome-wide Pol II pausing is technically challenging and the precise governing mechanism underlying this process is not fully understood. Here, we develop RP3 (RNA Polymerase II Pausing Prediction), a network regularized logistic regression machine learning method, to predict Pol II pausing events by integrating genome sequence, histone modification, gene expression, chromatin accessibility, and protein-protein interaction data. RP3 can accurately predict Pol II pausing in diverse cellular contexts and unveil the transcription factors that are associated with the Pol II pausing machinery. Furthermore, we utilize a forward feature selection framework to systematically identify the combination of histone modification signals associated with Pol II pausing. RP3 is freely available at https://github.com/AMSSwanglab/RP3.

Subjects

Histone Code; RNA Polymerase II; RNA Polymerase II/metabolism; Humans; Transcription Elongation, Genetic; Chromatin/metabolism; Chromatin/genetics; Histones/metabolism; Machine Learning; Animals

12.

Tensor-based insights into systems immunity and infectious disease.

Chin, Jackson L; Chan, Liana C; Yeaman, Michael R; Meyer, Aaron S.

Trends Immunol ; 44(5): 329-332, 2023 May.

Article in English | MEDLINE | ID: mdl-36997459

ABSTRACT

Profiling immune responses across several dimensions, including time, patients, molecular features, and tissue sites, can deepen our understanding of immunity as an integrated system. These studies require new analytical approaches to realize their full potential. We highlight recent applications of tensor methods and discuss several future opportunities.

Subjects

Communicable Diseases; Immunity; Humans

13.

Multiomics analysis identifies novel facilitators of human dopaminergic neuron differentiation.

Gomez Ramos, Borja; Ohnmacht, Jochen; de Lange, Nikola; Valceschini, Elena; Ginolhac, Aurélien; Catillon, Marie; Ferrante, Daniele; Rakovic, Aleksandar; Halder, Rashi; Massart, François; Arena, Giuseppe; Antony, Paul; Bolognin, Silvia; Klein, Christine; Krause, Roland; Schulz, Marcel H; Sauter, Thomas; Krüger, Rejko; Sinkkonen, Lasse.

EMBO Rep ; 25(1): 254-285, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38177910

ABSTRACT

Midbrain dopaminergic neurons (mDANs) control voluntary movement, cognition, and reward behavior under physiological conditions and are implicated in human diseases such as Parkinson's disease (PD). Many transcription factors (TFs) controlling human mDAN differentiation during development have been described, but much of the regulatory landscape remains undefined. Using a tyrosine hydroxylase (TH) human iPSC reporter line, we here generate time series transcriptomic and epigenomic profiles of purified mDANs during differentiation. Integrative analysis predicts novel regulators of mDAN differentiation and super-enhancers are used to identify key TFs. We find LBX1, NHLH1 and NR2F1/2 to promote mDAN differentiation and show that overexpression of either LBX1 or NHLH1 can also improve mDAN specification. A more detailed investigation of TF targets reveals that NHLH1 promotes the induction of neuronal miR-124, LBX1 regulates cholesterol biosynthesis, and NR2F1/2 controls neuronal activity.

Subjects

Dopaminergic Neurons; Induced Pluripotent Stem Cells; Humans; Dopaminergic Neurons/metabolism; Multiomics; Mesencephalon; Transcription Factors/genetics; Transcription Factors/metabolism; Induced Pluripotent Stem Cells/metabolism; Cell Differentiation/genetics; Basic Helix-Loop-Helix Transcription Factors/genetics

14.

Intricacies of single-cell multi-omics data integration.

Rautenstrauch, Pia; Vlot, Anna Hendrika Cornelia; Saran, Sepideh; Ohler, Uwe.

Trends Genet ; 38(2): 128-139, 2022 Feb.

Article in English | MEDLINE | ID: mdl-34561102

ABSTRACT

A wealth of single-cell protocols makes it possible to characterize different molecular layers at unprecedented resolution. Integrating the resulting multimodal single-cell data to find cell-to-cell correspondences remains a challenge. We argue that data integration needs to happen at a meaningful biological level of abstraction and that it is necessary to consider the inherent discrepancies between modalities to strike a balance between biological discovery and noise removal. A survey of current methods reveals that a distinction between technical and biological origins of presumed unwanted variation between datasets is not yet commonly considered. The increasing availability of paired multimodal data will aid the development of improved methods by providing a ground truth on cell-to-cell matches.

15.

Comprehensive evaluation and efficient classification of BRCA1 RING domain missense substitutions.

Clark, Kathleen A; Paquette, Andrew; Tao, Kayoko; Bell, Russell; Boyle, Julie L; Rosenthal, Judith; Snow, Angela K; Stark, Alex W; Thompson, Bryony A; Unger, Joshua; Gertz, Jason; Varley, Katherine E; Boucher, Kenneth M; Goldgar, David E; Foulkes, William D; Thomas, Alun; Tavtigian, Sean V.

Am J Hum Genet ; 109(6): 1153-1174, 2022 Jun 02.

Article in English | MEDLINE | ID: mdl-35659930

ABSTRACT

BRCA1 is a high-risk susceptibility gene for breast and ovarian cancer. Pathogenic protein-truncating variants are scattered across the open reading frame, but all known missense substitutions that are pathogenic because of missense dysfunction are located in either the amino-terminal RING domain or the carboxy-terminal BRCT domain. Heterodimerization of the BRCA1 and BARD1 RING domains is a molecularly defined obligate activity. Hence, we tested every BRCA1 RING domain missense substitution that can be created by a single nucleotide change for heterodimerization with BARD1 in a mammalian two-hybrid assay. Downstream of the laboratory assay, we addressed three additional challenges: assay calibration, validation thereof, and integration of the calibrated results with other available data, such as computational evidence and patient/population observational data to achieve clinically applicable classification. Overall, we found that 15%-20% of BRCA1 RING domain missense substitutions are pathogenic. Using a Bayesian point system for data integration and variant classification, we achieved clinical classification of 89% of observed missense substitutions. Moreover, among missense substitutions not present in the human observational data used here, we find an additional 45 with concordant computational and functional assay evidence in favor of pathogenicity plus 223 with concordant evidence in favor of benignity; these are particularly likely to be classified as likely pathogenic and likely benign, respectively, once human observational data become available.

Subjects

Breast Neoplasms; Ovarian Neoplasms; Animals; BRCA1 Protein/genetics; Bayes Theorem; Breast Neoplasms/genetics; Female; Humans; Mammals; Mutation, Missense/genetics; Ovarian Neoplasms/genetics; Protein Domains

16.

Simultaneous clustering and estimation of networks in multiple graphical models.

Li, Gen; Wang, Miaoyan.

Biostatistics ; 2024 Jun 05.

Article in English | MEDLINE | ID: mdl-38841872

ABSTRACT

Gaussian graphical models are widely used to study the dependence structure among variables. When samples are obtained from multiple conditions or populations, joint analysis of multiple graphical models is desired due to its capacity to borrow strength across populations. Nonetheless, existing methods often overlook the varying levels of similarity between populations, leading to unsatisfactory results. Moreover, in many applications, learning the population-level clustering structure itself is of particular interest. In this article, we develop a novel method, called Simultaneous Clustering and Estimation of Networks via Tensor decomposition (SCENT), that simultaneously clusters and estimates graphical models from multiple populations. Precision matrices from different populations are uniquely organized as a three-way tensor array, and a low-rank sparse model is proposed for joint population clustering and network estimation. We develop a penalized likelihood method and an augmented Lagrangian algorithm for model fitting. We also establish the clustering accuracy and norm consistency of the estimated precision matrices. We demonstrate the efficacy of the proposed method with comprehensive simulation studies. The application to the Genotype-Tissue Expression multi-tissue gene expression data provides important insights into tissue clustering and gene coexpression patterns in multiple brain tissues.

17.

clipplotr: a comparative visualization and analysis tool for CLIP data.

Chakrabarti, Anob M; Capitanchik, Charlotte; Ule, Jernej; Luscombe, Nicholas M.

RNA ; 29(6): 715-723, 2023 Jun.

Article in English | MEDLINE | ID: mdl-36894192

ABSTRACT

CLIP technologies are now widely used to study RNA-protein interactions and many data sets are now publicly available. An important first step in CLIP data exploration is the visual inspection and assessment of processed genomic data on selected genes or regions and performing comparisons: either across conditions within a particular project, or incorporating publicly available data. However, the output files produced by data processing pipelines or preprocessed files available to download from data repositories are often not suitable for direct comparison and usually need further processing. Furthermore, to derive biological insight it is usually necessary to visualize a CLIP signal alongside other data such as annotations, or orthogonal functional genomic data (e.g., RNA-seq). We have developed a simple, but powerful, command-line tool: clipplotr, which facilitates these visual comparative and integrative analyses with normalization and smoothing options for CLIP data and the ability to show these alongside reference annotation tracks and functional genomic data. These data can be supplied as input to clipplotr in a range of file formats, which will output a publication quality figure. It is written in R and can both run on a laptop computer independently or be integrated into computational workflows on a high-performance cluster. Releases, source code, and documentation are freely available at https://github.com/ulelab/clipplotr.

Subjects

Genomics; Software; Genome; RNA-Seq

18.

Multimodal deep learning approaches for single-cell multi-omics data integration.

Athaya, Tasbiraha; Ripan, Rony Chowdhury; Li, Xiaoman; Hu, Haiyan.

Brief Bioinform ; 24(5), 2023 Sep 20.

Article in English | MEDLINE | ID: mdl-37651607

ABSTRACT

Integrating single-cell multi-omics data is a challenging task that has led to new insights into complex cellular systems. Various computational methods have been proposed to effectively integrate these rapidly accumulating datasets, including deep learning. However, despite the proven success of deep learning in integrating multi-omics data and its better performance over classical computational methods, there has been no systematic study of its application to single-cell multi-omics data integration. To fill this gap, we conducted a literature review to explore the use of multimodal deep learning techniques in single-cell multi-omics data integration, taking into account recent studies from multiple perspectives. Specifically, we first summarized different modalities found in single-cell multi-omics data. We then reviewed current deep learning techniques for processing multimodal data and categorized deep learning-based integration methods for single-cell multi-omics data according to data modality, deep learning architecture, fusion strategy, key tasks and downstream analysis. Finally, we provided insights into using these deep learning models to integrate multi-omics data and better understand single-cell biological mechanisms.

Subjects

Deep Learning; Multiomics

19.

TransIntegrator: capture nearly full protein-coding transcript variants via integrating Illumina and PacBio transcriptomes.

Lin, Zhe; Qin, Yangmei; Chen, Hao; Shi, Dan; Zhong, Mindong; An, Te; Chen, Linshan; Wang, Yiquan; Lin, Fan; Li, Guang; Ji, Zhi-Liang.

Brief Bioinform ; 24(6), 2023 Sep 22.

Article in English | MEDLINE | ID: mdl-37779246

ABSTRACT

Genes have the ability to produce transcript variants that perform specific cellular functions. However, accurately detecting all transcript variants remains a long-standing challenge, especially when working with poorly annotated genomes or without a known genome. To address this issue, we have developed a new computational method, TransIntegrator, which enables transcriptome-wide detection of novel transcript variants. For this, we determined 10 Illumina sequencing transcriptomes and a PacBio full-length transcriptome for consecutive embryo development stages of amphioxus, a species of great evolutionary importance. Based on the transcriptomes, we employed TransIntegrator to create a comprehensive transcript variant library, namely iTranscriptome. The resulting iTranscriptome contained 91 915 distinct transcript variants, with an average of 2.4 variants per gene. This substantially improved current amphioxus genome annotation by expanding the number of genes from 21 954 to 38 777. Further analysis showed that the gene expansion was largely attributable to the integration of multiple Illumina datasets rather than to the PacBio data. Moreover, we demonstrated an example application of TransIntegrator, via generating iTranscriptome, in aiding accurate transcriptome assembly, which significantly outperformed other hybrid methods such as IDP-denovo and Trinity. For user convenience, we have deposited the source code of TransIntegrator on GitHub as well as a conda package in Anaconda. In summary, this study proposes an affordable but efficient method for reliable transcriptomic research in most species.

Subjects

Gene Expression Profiling; Transcriptome; Gene Expression Profiling/methods; Genome; Gene Library; High-Throughput Nucleotide Sequencing/methods

20.

Identifying phenotype-associated subpopulations through LP_SGL.

Li, Juntao; Zhang, Hongmei; Mu, Bingyu; Zuo, Hongliang; Zhou, Kanglei.

Brief Bioinform ; 25(1), 2023 Nov 22.

Article in English | MEDLINE | ID: mdl-38008419

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) enables the resolution of cellular heterogeneity in diseases and facilitates the identification of novel cell types and subtypes. However, the grouping effects caused by cell-cell interactions are often overlooked in the development of tools for identifying subpopulations. We proposed LP_SGL, which incorporates cell group structure to identify phenotype-associated subpopulations by integrating scRNA-seq, bulk expression and bulk phenotype data. Cell groups from scRNA-seq data were obtained by the Leiden algorithm, which facilitates the identification of subpopulations and improves model robustness. LP_SGL identified a higher percentage of cancer cells, T cells and tumor-associated cells than Scissor and scAB on lung adenocarcinoma diagnosis, melanoma drug response and liver cancer survival datasets, respectively. Biological analysis on three original datasets and four independent external validation sets demonstrated that the signaling genes of this cell subset can predict cancer, immunotherapy and survival.

Subjects

Adenocarcinoma of Lung; Lung Neoplasms; Humans; Algorithms; Cell Communication; Phenotype; Lung Neoplasms/genetics
