1.
Nat Rev Genet ; 24(8): 550-572, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37002403

ABSTRACT

Recent advances in single-cell technologies have enabled high-throughput molecular profiling of cells across modalities and locations. Single-cell transcriptomics data can now be complemented by chromatin accessibility, surface protein expression, adaptive immune receptor repertoire profiling and spatial information. The increasing availability of single-cell data across modalities has motivated the development of novel computational methods to help analysts derive biological insights. As the field grows, it becomes increasingly difficult to navigate the vast landscape of tools and analysis steps. Here, we summarize independent benchmarking studies of unimodal and multimodal single-cell analysis across modalities to suggest comprehensive best-practice workflows for the most common analysis steps. Where independent benchmarks are not available, we review and contrast popular methods. Our article serves as an entry point for novices in the field of single-cell (multi-)omic analysis and guides advanced users to the most recent best practices.


Subject(s)
Gene Expression Profiling, Proteomics, Gene Expression Profiling/methods, Single-Cell Analysis/methods
2.
Bioinformatics ; 39(4), 2023 Apr 03.
Article in English | MEDLINE | ID: mdl-37004171

ABSTRACT

MOTIVATION: Machine learning has shown extensive growth in recent years and is now routinely applied to sensitive areas. To allow appropriate verification of predictive models before deployment, models must be deterministic. Solely fixing all random seeds is not sufficient for deterministic machine learning, as major machine learning libraries default to the usage of nondeterministic algorithms based on atomic operations. RESULTS: Various machine learning libraries released deterministic counterparts to the nondeterministic algorithms. We evaluated the effect of these algorithms on determinism and runtime. Based on these results, we formulated a set of requirements for deterministic machine learning and developed a new software solution, the mlf-core ecosystem, which helps machine learning projects meet and maintain these requirements. We applied mlf-core to develop deterministic models in various biomedical fields including a single-cell autoencoder with TensorFlow, a PyTorch-based U-Net model for liver-tumor segmentation in computed tomography scans, and a liver cancer classifier based on gene expression profiles with XGBoost. AVAILABILITY AND IMPLEMENTATION: The complete data together with the implementations of the mlf-core ecosystem and use case models are available at https://github.com/mlf-core.


Subject(s)
Ecosystem, Software, Machine Learning, Algorithms, X-Ray Computed Tomography
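
The abstract of this record states that fixing random seeds is not enough when libraries use nondeterministic algorithms based on atomic operations. One commonly given reason is that atomic additions can complete in a different order on each run, and floating-point addition is not associative. The sketch below is only a CPU analogy in plain Python, not code from mlf-core or from any machine learning library; the `accumulate` helper and the input values are hypothetical and chosen so that the rounding is visible.

```python
import math


def accumulate(values, order):
    """Add values one at a time in the given index order.

    The order stands in for the (hypothetical) order in which parallel
    threads happen to finish their atomic additions.
    """
    total = 0.0
    for i in order:
        total += values[i]
    return total


# Hypothetical partial results with very different magnitudes.
values = [1e17, 1.0, -1e17, 1.0]

# Same numbers, no randomness involved, only the accumulation order differs.
run_a = accumulate(values, [0, 1, 2, 3])  # the first 1.0 is lost to rounding
run_b = accumulate(values, [0, 2, 1, 3])  # the large terms cancel first

# math.fsum tracks the rounding error, so its result does not depend on order.
exact_a = math.fsum(values[i] for i in [0, 1, 2, 3])
exact_b = math.fsum(values[i] for i in [0, 2, 1, 3])

print(run_a, run_b, exact_a, exact_b)
```

The two naive runs disagree although no random number generator is used, which is why seeding alone cannot restore reproducibility. The order-independent `math.fsum` plays the role of a deterministic counterpart here, typically at some runtime cost; how the actual libraries implement their deterministic algorithms is not described in this record.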
4.
J Proteome Res ; 18(11): 3876-3884, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31589052

ABSTRACT

Personalized multipeptide vaccines are currently being discussed intensively for tumor immunotherapy. In order to identify epitopes (short, immunogenic peptides) suitable for eliciting a tumor-specific immune response, human leukocyte antigen-presented peptides are isolated by immunoaffinity purification from cancer tissue samples and analyzed by liquid chromatography-coupled tandem mass spectrometry (LC-MS/MS). Here, we present MHCquant, a fully automated, portable computational pipeline able to process LC-MS/MS data automatically and generate annotated, false discovery rate-controlled lists of (neo-)epitopes with associated relative quantification information. We show that MHCquant achieves higher sensitivity than established methods. While it obtains the highest number of unique peptides, its rate of predicted MHC binders remains comparable to that of other tools. Reprocessing of the data from a previously published study resulted in the identification of several neoepitopes not detected by previously applied methods. MHCquant integrates tailor-made pipeline components with existing open-source software into a coherent processing workflow. Container-based virtualization permits execution of this workflow without complex software installation, execution on cluster/cloud infrastructures, and full reproducibility of the results. Integration with the data analysis workbench KNIME enables easy mining of large-scale immunopeptidomics data sets. MHCquant is available as open-source software along with accompanying documentation on our website at https://www.openms.de/mhcquant/.


Subject(s)
Computational Biology/methods, Data Analysis, Peptides/metabolism, Proteomics/methods, Liquid Chromatography/methods, HLA Antigens/immunology, Humans, Internet, Mutation, Peptides/genetics, Peptides/immunology, Reproducibility of Results, Software, Tandem Mass Spectrometry/methods
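
The abstract of this record mentions false discovery rate-controlled peptide lists without saying how the rate is estimated. A widely used approach in mass spectrometry is target-decoy competition, in which matches to deliberately wrong (decoy) sequences are used to estimate how many target matches are false. The sketch below illustrates that general idea only; it is an assumption that it resembles what MHCquant does, and the `q_values` function, the scores, and the 0.05 threshold are all hypothetical.

```python
def q_values(matches):
    """Estimate q-values by target-decoy counting.

    matches: list of (score, is_decoy) pairs, where a higher score means a
    better match. Returns (score, is_decoy, q_value) sorted by descending score.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)

    # Running FDR estimate at each rank: decoys seen / targets seen so far.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at which this match would still be accepted,
    # i.e. the minimum of the running FDR from this rank downwards.
    qs = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qs.append(running_min)
    qs.reverse()

    return [(score, is_decoy, q) for (score, is_decoy), q in zip(ranked, qs)]


# Hypothetical matches: (score, is_decoy)
matches = [(9.1, False), (8.7, False), (7.9, False),
           (7.2, True), (6.8, False), (5.0, True)]

scored = q_values(matches)
accepted = [score for score, is_decoy, q in scored if not is_decoy and q <= 0.05]
print(accepted)
```

With these toy values, the three best-scoring target matches pass the threshold, and the target that ranks below the first decoy does not.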
6.
Genome Biol ; 25(1): 181, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38978088

ABSTRACT

Single-cell multiomic analysis of the epigenome, transcriptome, and proteome allows for comprehensive characterization of the molecular circuitry that underpins cell identity and state. However, the holistic interpretation of such datasets presents a challenge given a paucity of approaches for systematic, joint evaluation of different modalities. Here, we present Panpipes, a set of computational workflows designed to automate multimodal single-cell and spatial transcriptomic analyses by incorporating widely used Python-based tools to perform quality control, preprocessing, integration, clustering, and reference mapping at scale. Panpipes allows reliable and customizable analysis and evaluation of individual and integrated modalities, thereby empowering decision-making before downstream investigations.


Subject(s)
Single-Cell Analysis, Software, Transcriptome, Single-Cell Analysis/methods, Gene Expression Profiling/methods, Humans, Workflow
7.
Genome Biol ; 25(1): 109, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38671451

ABSTRACT

Single-cell multiplexing techniques (cell hashing and genetic multiplexing) combine multiple samples, optimizing sample processing and reducing costs. Cell hashing conjugates antibody-tags or chemical-oligonucleotides to cell membranes, while genetic multiplexing allows genetically diverse samples to be mixed and relies on aggregation of RNA reads at known genomic coordinates. We develop hadge (hashing deconvolution combined with genotype information), a Nextflow pipeline that combines 12 methods to perform both hashing- and genotype-based deconvolution. We propose a joint deconvolution strategy combining best-performing methods and demonstrate how this approach leads to the recovery of previously discarded cells in a nuclei hashing of fresh-frozen brain tissue.


Subject(s)
Single-Cell Analysis, Single-Cell Analysis/methods, Humans, Brain/metabolism, Brain/cytology, Software, Genotype
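
The abstract of this record describes combining hashing-based and genotype-based calls so that previously discarded cells can be recovered. A toy version of such a combination rule might look like the sketch below. The rule, the `joint_call` function, the labels, and the donor-to-sample map are all illustrative assumptions; the record does not describe hadge's actual logic, and this is not its implementation.

```python
def joint_call(hashing_call, genotype_call, donor_to_sample):
    """Combine two per-cell calls into one assignment (hypothetical rule).

    hashing_call:  a sample label, "negative", or "doublet"
    genotype_call: a donor label, "unassigned", or "doublet"
    donor_to_sample: mapping from donor labels to sample labels
    """
    # Either method flagging a doublet is treated as a doublet.
    if hashing_call == "doublet" or genotype_call == "doublet":
        return "doublet"
    # A confident hashing call is kept as is.
    if hashing_call != "negative":
        return hashing_call
    # Hashing failed: fall back on the genotype-based donor, if it maps to a sample.
    return donor_to_sample.get(genotype_call, "negative")


# Hypothetical mapping and per-cell calls.
donor_to_sample = {"donor0": "HTO_A", "donor1": "HTO_B"}
cells = {
    "cell1": ("HTO_A", "donor0"),         # both methods agree
    "cell2": ("negative", "donor1"),      # rescued via genotype
    "cell3": ("negative", "unassigned"),  # stays unassigned
    "cell4": ("HTO_B", "doublet"),        # flagged by genotype
}

assignments = {
    cell: joint_call(hashing, genotype, donor_to_sample)
    for cell, (hashing, genotype) in cells.items()
}
print(assignments)
```

In this toy input, `cell2` would have been discarded on hashing alone and is recovered through its genotype-based donor call.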
8.
Nat Med ; 2024 Sep 12.
Article in English | MEDLINE | ID: mdl-39266748

ABSTRACT

With progressive digitalization of healthcare systems worldwide, large-scale collection of electronic health records (EHRs) has become commonplace. However, an extensible framework for comprehensive exploratory analysis that accounts for data heterogeneity is missing. Here we introduce ehrapy, a modular open-source Python framework designed for exploratory analysis of heterogeneous epidemiology and EHR data. ehrapy incorporates a series of analytical steps, from data extraction and quality control to the generation of low-dimensional representations. Complemented by rich statistical modules, ehrapy facilitates associating patients with disease states, differential comparison between patient clusters, survival analysis, trajectory inference, causal inference and more. Leveraging ontologies, ehrapy further enables data sharing and training EHR deep learning models, paving the way for foundational models in biomedical research. We demonstrate ehrapy's features in six distinct examples. We applied ehrapy to stratify patients affected by unspecified pneumonia into finer-grained phenotypes. Furthermore, we reveal biomarkers for significant differences in survival among these groups. Additionally, we quantify medication-class effects of pneumonia medications on length of stay. We further leveraged ehrapy to analyze cardiovascular risks across different data modalities. We reconstructed disease state trajectories in patients with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) based on imaging data. Finally, we conducted a case study to demonstrate how ehrapy can detect and mitigate biases in EHR data. ehrapy, thus, provides a framework that we envision will standardize analysis pipelines on EHR data and serve as a cornerstone for the community.

9.
Sci Transl Med ; 15(725): eadh0908, 2023 Dec 06.
Article in English | MEDLINE | ID: mdl-38055803

ABSTRACT

Pulmonary fibrosis develops as a consequence of failed regeneration after injury. Analyzing mechanisms of regeneration and fibrogenesis directly in human tissue has been hampered by the lack of organotypic models and analytical techniques. In this work, we coupled ex vivo cytokine and drug perturbations of human precision-cut lung slices (hPCLS) with single-cell RNA sequencing and induced a multilineage circuit of fibrogenic cell states in hPCLS. We showed that these cell states were highly similar to the in vivo cell circuit in a multicohort lung cell atlas from patients with pulmonary fibrosis. Using micro-CT-staged patient tissues, we characterized the appearance and interaction of myofibroblasts, an ectopic endothelial cell state, and basaloid epithelial cells in the thickened alveolar septum of early-stage lung fibrosis. Induction of these states in the hPCLS model provided evidence that the basaloid cell state was derived from alveolar type 2 cells, whereas the ectopic endothelial cell state emerged from capillary cell plasticity. Cell-cell communication routes in patients were largely conserved in hPCLS, and antifibrotic drug treatments showed highly cell type-specific effects. Our work provides an experimental framework for perturbational single-cell genomics directly in human lung tissue that enables analysis of tissue homeostasis, regeneration, and pathology. We further demonstrate that hPCLS offer an avenue for scalable, high-resolution drug testing to accelerate antifibrotic drug development and translation.


Subject(s)
Pulmonary Fibrosis, Humans, Pulmonary Fibrosis/genetics, Pulmonary Fibrosis/pathology, Single-Cell Gene Expression Analysis, Lung/pathology, Alveolar Epithelial Cells, Epithelial Cells/metabolism
10.
Nat Med ; 29(6): 1563-1577, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37291214

ABSTRACT

Single-cell technologies have transformed our understanding of human tissues. Yet, studies typically capture only a limited number of donors and disagree on cell type definitions. Integrating many single-cell datasets can address these limitations of individual studies and capture the variability present in the population. Here we present the integrated Human Lung Cell Atlas (HLCA), combining 49 datasets of the human respiratory system into a single atlas spanning over 2.4 million cells from 486 individuals. The HLCA presents a consensus cell type re-annotation with matching marker genes, including annotations of rare and previously undescribed cell types. Leveraging the number and diversity of individuals in the HLCA, we identify gene modules that are associated with demographic covariates such as age, sex and body mass index, as well as gene modules changing expression along the proximal-to-distal axis of the bronchial tree. Mapping new data to the HLCA enables rapid data annotation and interpretation. Using the HLCA as a reference for the study of disease, we identify shared cell states across multiple lung diseases, including SPP1+ profibrotic monocyte-derived macrophages in COVID-19, pulmonary fibrosis and lung carcinoma. Overall, the HLCA serves as an example for the development and use of large-scale, cross-dataset organ atlases within the Human Cell Atlas.


Subject(s)
COVID-19, Lung Neoplasms, Pulmonary Fibrosis, Humans, Lung, Lung Neoplasms/genetics, Macrophages