Results 1 - 20 of 38
1.
GigaByte ; 2024: gigabyte118, 2024.
Article in English | MEDLINE | ID: mdl-38746537

ABSTRACT

Marsupials exhibit distinctive modes of reproduction and early development that set them apart from their eutherian counterparts and render them invaluable for comparative studies. However, marsupial genomic resources still lag far behind those of eutherian mammals. We present a series of novel genomic resources for the fat-tailed dunnart (Sminthopsis crassicaudata), a mouse-like marsupial that, due to its ease of husbandry and ex-utero development, is emerging as a laboratory model. We constructed a highly representative multi-tissue de novo transcriptome assembly of dunnart RNA-seq reads spanning 12 tissues. The transcriptome includes 2,093,982 assembled transcripts and has a mammalian transcriptome BUSCO completeness score of 93.3%, the highest amongst currently published marsupial transcriptomes. This global transcriptome, along with ab initio predictions, supported annotation of the existing dunnart genome, revealing 21,622 protein-coding genes. Altogether, these resources will enable wider use of the dunnart as a model marsupial and deepen our understanding of mammalian genome evolution.

2.
Med Image Anal ; 96: 103192, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38810516

ABSTRACT

Methods to detect malignant lesions from screening mammograms are usually trained with fully annotated datasets, where images are labelled with the localisation and classification of cancerous lesions. However, real-world screening mammogram datasets commonly have a subset that is fully annotated and another subset that is weakly annotated with just the global classification (i.e., without lesion localisation). Given the large size of such datasets, researchers usually face a dilemma with the weakly annotated subset: to not use it or to fully annotate it. The first option will reduce detection accuracy because it does not use the whole dataset, and the second option is too expensive given that the annotation needs to be done by expert radiologists. In this paper, we propose a middle-ground solution for the dilemma, which is to formulate the training as a weakly- and semi-supervised learning problem that we refer to as malignant breast lesion detection with incomplete annotations. To address this problem, our new method comprises two stages, namely: (1) pre-training a multi-view mammogram classifier with weak supervision from the whole dataset, and (2) extending the trained classifier to become a multi-view detector that is trained with semi-supervised student-teacher learning, where the training set contains fully and weakly-annotated mammograms. We provide extensive detection results on two real-world screening mammogram datasets containing incomplete annotations and show that our proposed approach achieves state-of-the-art results in the detection of malignant breast lesions with incomplete annotations.


Subject(s)
Breast Neoplasms , Mammography , Radiographic Image Interpretation, Computer-Assisted , Humans , Breast Neoplasms/diagnostic imaging , Mammography/methods , Female , Radiographic Image Interpretation, Computer-Assisted/methods , Algorithms , Supervised Machine Learning
4.
Genome Biol ; 25(1): 94, 2024 04 15.
Article in English | MEDLINE | ID: mdl-38622708

ABSTRACT

Recent innovations in single-cell RNA-sequencing (scRNA-seq) provide the technology to investigate biological questions at cellular resolution. Pooling cells from multiple individuals has become a common strategy, and droplets can subsequently be assigned to a specific individual by leveraging their inherent genetic differences. An implicit challenge with scRNA-seq is the occurrence of doublets: droplets containing two or more cells. We develop Demuxafy, a framework to enhance donor assignment and doublet removal through the consensus intersection of multiple demultiplexing and doublet detecting methods. Demuxafy significantly improves droplet assignment by separating singlets from doublets and classifying the correct individual.


Subject(s)
Single-Cell Analysis , Humans , Single-Cell Analysis/methods , Sequence Analysis, RNA/methods
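The consensus idea in the Demuxafy abstract above can be illustrated with a minimal sketch. The abstract does not spell out how the individual calls are combined, so a simple majority vote is swapped in here as an assumption; the function name `consensus_call`, the `"unassigned"` label, and the example barcodes are hypothetical and are not part of Demuxafy.

```python
from collections import Counter


def consensus_call(calls, min_agree=None):
    """Combine per-tool labels for one droplet by majority vote.

    calls: list of labels (a donor ID or "doublet"), one per tool.
    min_agree: votes required to accept a label; defaults to a strict
    majority of the tools (an assumption, not Demuxafy's rule).
    """
    if min_agree is None:
        min_agree = len(calls) // 2 + 1
    label, count = Counter(calls).most_common(1)[0]
    return label if count >= min_agree else "unassigned"


# Hypothetical calls from three tools for three droplet barcodes.
droplets = {
    "AAAC": ["donor1", "donor1", "donor1"],
    "AAAG": ["donor2", "doublet", "doublet"],
    "AAAT": ["donor1", "donor2", "doublet"],
}
consensus = {barcode: consensus_call(c) for barcode, c in droplets.items()}
```

Droplets on which the tools disagree are left unassigned rather than forced into a donor, which is one plausible way to trade recall for cleaner singlet calls.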
5.
Nat Genet ; 56(4): 595-604, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38548990

ABSTRACT

Common genetic variants confer substantial risk for chronic lung diseases, including pulmonary fibrosis. Defining the genetic control of gene expression in a cell-type-specific and context-dependent manner is critical for understanding the mechanisms through which genetic variation influences complex traits and disease pathobiology. To this end, we performed single-cell RNA sequencing of lung tissue from 66 individuals with pulmonary fibrosis and 48 unaffected donors. Using a pseudobulk approach, we mapped expression quantitative trait loci (eQTLs) across 38 cell types, observing both shared and cell-type-specific regulatory effects. Furthermore, we identified disease interaction eQTLs and demonstrated that this class of associations is more likely to be cell-type-specific and linked to cellular dysregulation in pulmonary fibrosis. Finally, we connected lung disease risk variants to their regulatory targets in disease-relevant cell types. These results indicate that cellular context determines the impact of genetic variation on gene expression and implicate context-specific eQTLs as key regulators of lung homeostasis and disease.


Subject(s)
Pulmonary Fibrosis , Quantitative Trait Loci , Humans , Quantitative Trait Loci/genetics , Pulmonary Fibrosis/genetics , Gene Expression Regulation/genetics , Lung , Multifactorial Inheritance , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide
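The "pseudobulk approach" mentioned in the abstract above generally means collapsing single-cell counts into one profile per donor and cell type before eQTL mapping. The sketch below assumes aggregation by summing raw counts; the abstract does not say whether the authors summed or averaged, and the function name `pseudobulk`, the donor IDs, and the counts are hypothetical.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum per-cell gene counts into one profile per (donor, cell type).

    cells: iterable of (donor, cell_type, {gene: count}) tuples.
    Returns {(donor, cell_type): {gene: summed_count}}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for donor, cell_type, counts in cells:
        for gene, n in counts.items():
            totals[(donor, cell_type)][gene] += n
    # Convert nested defaultdicts to plain dicts for predictable lookups.
    return {key: dict(genes) for key, genes in totals.items()}


# Hypothetical toy input: four cells from two donors.
cells = [
    ("d1", "AT2", {"SFTPC": 7, "ACTB": 2}),
    ("d1", "AT2", {"SFTPC": 5}),
    ("d2", "AT2", {"SFTPC": 1}),
    ("d1", "Macrophage", {"ACTB": 4}),
]
profiles = pseudobulk(cells)
```

Each resulting profile can then be treated like a bulk sample for one cell type, so that genotype can be tested against expression across donors.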
6.
Genome Biol ; 25(1): 56, 2024 02 26.
Article in English | MEDLINE | ID: mdl-38409056

ABSTRACT

BACKGROUND: The development of single-cell RNA sequencing (scRNA-seq) has enabled scientists to catalog and probe the transcriptional heterogeneity of individual cells in unprecedented detail. A common step in the analysis of scRNA-seq data is the selection of so-called marker genes, most commonly to enable annotation of the biological cell types present in the sample. In this paper, we benchmark 59 computational methods for selecting marker genes in scRNA-seq data. RESULTS: We compare the performance of the methods using 14 real scRNA-seq datasets and over 170 additional simulated datasets. Methods are compared on their ability to recover simulated and expert-annotated marker genes, the predictive performance and characteristics of the gene sets they select, their memory usage and speed, and their implementation quality. In addition, various case studies are used to scrutinize the most commonly used methods, highlighting issues and inconsistencies. CONCLUSIONS: Overall, we present a comprehensive evaluation of methods for selecting marker genes in scRNA-seq data. Our results highlight the efficacy of simple methods, especially the Wilcoxon rank-sum test, Student's t-test, and logistic regression.


Subject(s)
Benchmarking , Single-Cell Analysis , Single-Cell Analysis/methods , Exome Sequencing , Sequence Analysis, RNA , Gene Expression Profiling , Software
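The abstract above singles out the Wilcoxon rank-sum test as an effective simple method for marker gene selection. As a rough sketch of the statistic only (no p-value is computed), the code below derives the Mann-Whitney U value for one gene from midranks and rescales it to an AUC-like score in [0, 1]. The function name `rank_sum_auc`, the gene names, and the expression values are hypothetical, and this is not one of the benchmarked implementations.

```python
def rank_sum_auc(in_cluster, out_cluster):
    """Return (U, U / (n1 * n2)) for one gene.

    in_cluster / out_cluster: expression values for cells inside and
    outside the cluster of interest. Ties receive averaged (mid) ranks.
    """
    values = [(v, 0) for v in in_cluster] + [(v, 1) for v in out_cluster]
    values.sort(key=lambda t: t[0])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        # Extend j over the run of values tied with values[i].
        while j + 1 < len(values) and values[j + 1][0] == values[i][0]:
            j += 1
        mid = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = mid
        i = j + 1
    n1, n2 = len(in_cluster), len(out_cluster)
    r1 = sum(r for r, (_, group) in zip(ranks, values) if group == 0)
    u = r1 - n1 * (n1 + 1) / 2
    return u, u / (n1 * n2)


# Hypothetical toy data: (values in cluster, values outside cluster).
expression = {
    "GENE_A": ([5, 6, 7], [1, 2, 3]),
    "GENE_B": ([1, 2], [1, 2]),
}
scores = {g: rank_sum_auc(a, b)[1] for g, (a, b) in expression.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

A score near 1 means the gene is consistently higher inside the cluster, and a score near 0.5 means it does not separate the two groups.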
7.
Cell Genom ; 3(8): 100349, 2023 Aug 09.
Article in English | MEDLINE | ID: mdl-37601968

ABSTRACT

Meiotic crossovers are required for accurate chromosome segregation and producing new allelic combinations. Meiotic crossover numbers are tightly regulated within a narrow range, despite an excess of initiating DNA double-strand breaks. Here, we reveal the tumor suppressor FANCM as a meiotic anti-crossover factor in mammals. We use unique large-scale crossover analyses with both single-gamete sequencing and pedigree-based bulk-sequencing datasets to identify a genome-wide increase in crossover frequencies in Fancm-deficient mice. Gametogenesis is heavily perturbed in Fancm loss-of-function mice, which is consistent with the reproductive defects reported in humans with biallelic FANCM mutations. A portion of the gametogenesis defects can be attributed to the cGAS-STING pathway after birth. Despite the gametogenesis phenotypes in Fancm mutants, both sexes are capable of producing offspring. We propose that the anti-crossover function and role in gametogenesis of Fancm are separable and will inform diagnostic pathways for human genomic instability disorders.

8.
Radiol Artif Intell ; 5(2): e220072, 2023 Mar.
Article in English | MEDLINE | ID: mdl-37035431

ABSTRACT

Supplemental material is available for this article. Keywords: Mammography, Screening, Convolutional Neural Network (CNN) Published under a CC BY 4.0 license. See also the commentary by Cadrin-Chênevert in this issue.

9.
bioRxiv ; 2023 Jun 29.
Article in English | MEDLINE | ID: mdl-36993211

ABSTRACT

Common genetic variants confer substantial risk for chronic lung diseases, including pulmonary fibrosis (PF). Defining the genetic control of gene expression in a cell-type-specific and context-dependent manner is critical for understanding the mechanisms through which genetic variation influences complex traits and disease pathobiology. To this end, we performed single-cell RNA-sequencing of lung tissue from 67 PF and 49 unaffected donors. Employing a pseudo-bulk approach, we mapped expression quantitative trait loci (eQTL) across 38 cell types, observing both shared and cell type-specific regulatory effects. Further, we identified disease-interaction eQTL and demonstrated that this class of associations is more likely to be cell-type specific and linked to cellular dysregulation in PF. Finally, we connected PF risk variants to their regulatory targets in disease-relevant cell types. These results indicate that cellular context determines the impact of genetic variation on gene expression and implicate context-specific eQTL as key regulators of lung homeostasis and disease.

10.
bioRxiv ; 2023 Dec 17.
Article in English | MEDLINE | ID: mdl-38168317

ABSTRACT

The human lung is structurally complex, with a diversity of specialized epithelial, stromal and immune cells playing specific functional roles in anatomically distinct locations, and large-scale changes in the structure and cellular makeup of the distal lung are a hallmark of pulmonary fibrosis (PF) and other progressive chronic lung diseases. Single-cell transcriptomic studies have revealed numerous disease-emergent/enriched cell types/states in PF lungs, but the spatial contexts wherein these cells contribute to disease pathogenesis have remained uncertain. Using sub-cellular resolution image-based spatial transcriptomics, we analyzed the gene expression of more than 1 million cells from 19 unique lungs. Through complementary cell-based and innovative cell-agnostic analyses, we characterized the localization of PF-emergent cell-types, established the cellular and molecular basis of classical PF histopathologic disease features, and identified a diversity of distinct molecularly-defined spatial niches in control and PF lungs. Using machine-learning and trajectory analysis methods to segment and rank airspaces on a gradient from normal to most severely remodeled, we identified a sequence of compositional and molecular changes that associate with progressive distal lung pathology, beginning with alveolar epithelial dysregulation and culminating with changes in macrophage polarization. Together, these results provide a unique, spatially-resolved characterization of the cellular and molecular programs of PF and control lungs, provide new insights into the heterogeneous pathobiology of PF, and establish analytical approaches which should be broadly applicable to other imaging-based spatial transcriptomic studies.

11.
BMC Bioinformatics ; 23(1): 460, 2022 Nov 03.
Article in English | MEDLINE | ID: mdl-36329399

ABSTRACT

BACKGROUND: Single-cell RNA sequencing (scRNA-seq) technology has contributed significantly to diverse research areas in biology, from cancer to development. Since scRNA-seq data is high-dimensional, a common strategy is to learn low-dimensional latent representations to better understand the overall structure in the data. In this work, we build upon scVI, a powerful deep generative model which can learn biologically meaningful latent representations, but which has limited explicit control of batch effects. Rather than prioritizing batch effect removal over conservation of biological variation, or vice versa, our goal is to provide a bird's eye view of the trade-offs between these two conflicting objectives. Specifically, using the well-established concept of the Pareto front from economics and engineering, we seek to learn the entire trade-off curve between conservation of biological variation and removal of batch effects. RESULTS: A multi-objective optimisation technique known as Pareto multi-task learning (Pareto MTL) is used to obtain the Pareto front between conservation of biological variation and batch effect removal. Our results indicate Pareto MTL can obtain a better Pareto front than the naive scalarization approach typically encountered in the literature. In addition, we propose to measure batch effect by applying a neural-network based estimator called Mutual Information Neural Estimation (MINE) and show benefits over the more standard maximum mean discrepancy measure. CONCLUSION: The Pareto front between conservation of biological variation and batch effect removal is a valuable tool for researchers in computational biology. Our results demonstrate the efficacy of applying Pareto MTL to estimate the Pareto front in conjunction with applying MINE to measure the batch effect.


Subject(s)
Algorithms , Transcriptome , Computational Biology/methods , Single-Cell Analysis
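The Pareto front referred to in the abstract above is the set of trade-off points that no other point beats on both objectives at once. The sketch below only filters a list of already-evaluated models down to that non-dominated set; it is not Pareto MTL, which is a training procedure. It assumes both objectives are expressed as scores where higher is better, and the function name `pareto_front` and the example scores are hypothetical.

```python
def pareto_front(points):
    """Return the non-dominated points, in input order.

    points: list of (bio_conservation, batch_removal) score pairs,
    where higher is better for both (an assumption for this sketch).
    A point is dominated if another, different point is at least as
    good on both objectives.
    """
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] >= p[0] and q[1] >= p[1] for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical scores for five trained models.
models = [(0.9, 0.2), (0.7, 0.6), (0.6, 0.5), (0.3, 0.9), (0.2, 0.8)]
front = pareto_front(models)
```

Here `(0.6, 0.5)` and `(0.2, 0.8)` drop out because another model is at least as good on both objectives, which leaves three models that represent genuinely different trade-offs.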
12.
PLoS One ; 17(9): e0275168, 2022.
Article in English | MEDLINE | ID: mdl-36173986

ABSTRACT

We developed a simple and reliable method for the isolation of haploid nuclei from fresh and frozen testes. The described protocol uses readily available reagents in combination with flow cytometry to separate haploid and diploid nuclei. The protocol can be completed within 1 hour and the resulting individual haploid nuclei have intact morphology. The isolated nuclei are suitable for library preparation for high-throughput DNA and RNA sequencing using bulk or single nuclei. The protocol was optimised with mouse testes and we anticipate that it can be applied for the isolation of mature sperm from other mammals including humans.


Subject(s)
Nucleic Acids , Animals , Gene Library , High-Throughput Nucleotide Sequencing , Humans , Male , Mammals , Mice , Semen , Spermatozoa
13.
Nucleic Acids Res ; 50(20): e118, 2022 11 11.
Article in English | MEDLINE | ID: mdl-36107768

ABSTRACT

Profiling gametes of an individual enables the construction of personalised haplotypes and meiotic crossover landscapes, now achievable at larger scale than ever through the availability of high-throughput single-cell sequencing technologies. However, high-throughput single-gamete data commonly have low depth of coverage per gamete, which challenges existing gamete-based haplotype phasing methods. In addition, haplotyping a large number of single gametes from high-throughput single-cell DNA sequencing data and constructing meiotic crossover profiles using existing methods requires intensive processing. Here, we introduce efficient software tools for the essential tasks of generating personalised haplotypes and calling crossovers in gametes from single-gamete DNA sequencing data (sgcocaller), and constructing, visualising, and comparing individualised crossover landscapes from single gametes (comapr). With additional data pre-processing, the tools can also be applied to bulk-sequenced samples. We demonstrate that sgcocaller is able to generate impeccable phasing results for high-coverage datasets, on which it is more accurate and stable than existing methods, and also performs well on low-coverage single-gamete sequencing datasets for which current methods fail. Our tools achieve highly accurate results with user-friendly installation, comprehensive documentation, efficient computation times and minimal memory usage.


Subject(s)
High-Throughput Nucleotide Sequencing , Polymorphism, Single Nucleotide , Sequence Analysis, DNA , Algorithms , Germ Cells , Haplotypes , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Single-Cell Gene Expression Analysis , Software , Crossing Over, Genetic
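To make "calling crossovers in gametes" from the abstract above concrete: once each marker in a gamete has been assigned to one of the two parental haplotypes, a crossover appears as a switch between haplotypes along the chromosome. The naive sketch below reports the interval between the two observed markers that flank each switch and skips markers with no coverage. It is not the sgcocaller algorithm, which the abstract does not describe, and it makes no attempt to handle genotyping errors; the function name `crossover_intervals` and the positions are hypothetical.

```python
def crossover_intervals(markers):
    """Return (left_pos, right_pos) intervals where the haplotype switches.

    markers: list of (position, state) sorted by position, where state
    is 0 or 1 for the parental haplotype, or None if the marker has no
    coverage in this gamete.
    """
    observed = [(pos, s) for pos, s in markers if s is not None]
    return [
        (p1, p2)
        for (p1, s1), (p2, s2) in zip(observed, observed[1:])
        if s1 != s2
    ]


# Hypothetical gamete with one uncovered marker at position 300.
gamete = [(100, 0), (200, 0), (300, None), (400, 1), (500, 1)]
intervals = crossover_intervals(gamete)
```

With sparse coverage the flanking interval widens, as it does here from 200 to 400, which illustrates why low depth per gamete makes crossover localisation harder.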
14.
Genome Biol ; 22(1): 341, 2021 12 15.
Article in English | MEDLINE | ID: mdl-34911537

ABSTRACT

Population-scale single-cell RNA sequencing (scRNA-seq) is now viable, enabling finer resolution functional genomics studies and leading to a rush to adapt bulk methods and develop new single-cell-specific methods to perform these studies. Simulations are useful for developing, testing, and benchmarking methods but current scRNA-seq simulation frameworks do not simulate population-scale data with genetic effects. Here, we present splatPop, a model for flexible, reproducible, and well-documented simulation of population-scale scRNA-seq data with known expression quantitative trait loci. splatPop can also simulate complex batch, cell group, and conditional effects between individuals from different cohorts as well as genetically-driven co-expression.


Subject(s)
Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Benchmarking , Cluster Analysis , Computer Simulation , Gene Expression Profiling/methods , Genomics , Humans , Quantitative Trait Loci , Software
15.
Genome Biol ; 22(1): 188, 2021 06 24.
Article in English | MEDLINE | ID: mdl-34167583

ABSTRACT

BACKGROUND: Single-cell RNA sequencing (scRNA-seq) has enabled the unbiased, high-throughput quantification of gene expression specific to cell types and states. With the cost of scRNA-seq decreasing and techniques for sample multiplexing improving, population-scale scRNA-seq, and thus single-cell expression quantitative trait locus (sc-eQTL) mapping, is increasingly feasible. Mapping of sc-eQTL provides additional resolution to study the regulatory role of common genetic variants on gene expression across a plethora of cell types and states and promises to improve our understanding of genetic regulation across tissues in both health and disease. RESULTS: While previously established methods for bulk eQTL mapping can, in principle, be applied to sc-eQTL mapping, there are a number of open questions about how best to process scRNA-seq data and adapt bulk methods to optimize sc-eQTL mapping. Here, we evaluate the role of different normalization and aggregation strategies, covariate adjustment techniques, and multiple testing correction methods to establish best practice guidelines. We use both real and simulated datasets across single-cell technologies to systematically assess the impact of these different statistical approaches. CONCLUSION: We provide recommendations for future single-cell eQTL studies that can yield up to twice as many eQTL discoveries as default approaches ported from bulk studies.


Subject(s)
Chromosome Mapping/statistics & numerical data , Genome, Human , Induced Pluripotent Stem Cells/metabolism , Quantitative Trait Loci , Single-Cell Analysis/methods , Alleles , Cell Line , Gene Expression Profiling , Gene Expression Regulation , Humans , Induced Pluripotent Stem Cells/cytology , Sequence Analysis, RNA , Software , Exome Sequencing
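The abstract above lists multiple testing correction among the choices it evaluates but does not name specific procedures. As one widely used example, and purely as an assumption about what such a step might look like, the sketch below implements the Benjamini-Hochberg adjustment; the function name `benjamini_hochberg` and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest), then a
    running minimum from the largest rank downwards keeps the adjusted
    values monotone and capped at 1.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four gene-variant tests.
pvals = [0.01, 0.04, 0.03, 0.005]
qvals = benjamini_hochberg(pvals)
significant = [i for i, q in enumerate(qvals) if q <= 0.05]
```

Tests whose adjusted value falls at or below the chosen false discovery rate threshold are reported as discoveries.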
16.
Genome Biol ; 22(1): 112, 2021 04 19.
Article in English | MEDLINE | ID: mdl-33874978

ABSTRACT

Genetic maps have been fundamental to building our understanding of disease genetics and evolutionary processes. The gametes of an individual contain all of the information required to perform a de novo chromosome-scale assembly of an individual's genome, which historically has been performed with populations and pedigrees. Here, we discuss how single-cell gamete sequencing offers the potential to merge the advantages of short-read sequencing with the ability to build personalized genetic maps and open up an entirely new space in personalized genetics.


Subject(s)
Genome , Genomics/methods , Germ Cells/metabolism , High-Throughput Nucleotide Sequencing , Precision Medicine/methods , Single-Cell Analysis/methods , Animals , Chromosome Mapping , Computational Biology/methods , Computational Biology/standards , Data Interpretation, Statistical , Genetic Heterogeneity , Genomics/standards , High-Throughput Nucleotide Sequencing/methods , Humans , Precision Medicine/standards , Reproducibility of Results , Sensitivity and Specificity , Single-Cell Analysis/standards , Whole Genome Sequencing
17.
Nat Biotechnol ; 38(6): 747-755, 2020 06.
Article in English | MEDLINE | ID: mdl-32518403

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) is the leading technique for characterizing the transcriptomes of individual cells in a sample. The latest protocols are scalable to thousands of cells and are being used to compile cell atlases of tissues, organs and organisms. However, the protocols differ substantially with respect to their RNA capture efficiency, bias, scale and costs, and their relative advantages for different applications are unclear. In the present study, we generated benchmark datasets to systematically evaluate protocols in terms of their power to comprehensively describe cell types and states. We performed a multicenter study comparing 13 commonly used scRNA-seq and single-nucleus RNA-seq protocols applied to a heterogeneous reference sample resource. Comparative analysis revealed marked differences in protocol performance. The protocols differed in library complexity and their ability to detect cell-type markers, impacting their predictive value and suitability for integration into reference cell atlases. These results provide guidance both for individual researchers and for consortium projects such as the Human Cell Atlas.


Subject(s)
Sequence Analysis, RNA , Single-Cell Analysis , Animals , Benchmarking , Cell Line , Databases, Genetic , Genomics/methods , Genomics/standards , Humans , Mice , Sequence Analysis, RNA/methods , Sequence Analysis, RNA/standards , Single-Cell Analysis/methods , Single-Cell Analysis/standards
19.
Nat Methods ; 17(4): 414-421, 2020 04.
Article in English | MEDLINE | ID: mdl-32203388

ABSTRACT

Bulk and single-cell DNA sequencing has enabled reconstructing clonal substructures of somatic tissues from frequency and co-occurrence patterns of somatic variants. However, approaches to characterize phenotypic variations between clones are not established. Here we present cardelino (https://github.com/single-cell-genetics/cardelino), a computational method for inferring the clonal tree configuration and the clone of origin of individual cells assayed using single-cell RNA-seq (scRNA-seq). Cardelino flexibly integrates information from imperfect clonal trees inferred based on bulk exome-seq data, and sparse variant alleles expressed in scRNA-seq data. We apply cardelino to a published cancer dataset and to newly generated matched scRNA-seq and exome-seq data from 32 human dermal fibroblast lines, identifying hundreds of differentially expressed genes between cells from different somatic clones. These genes are frequently enriched for cell cycle and proliferation pathways, indicating a role for cell division genes in somatic evolution in healthy skin.


Subject(s)
Fibroblasts/metabolism , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Software , Algorithms , Cell Cycle , Cell Proliferation , Humans , Melanoma , Mutation , Transcriptome
20.
Genome Biol ; 21(1): 31, 2020 02 07.
Article in English | MEDLINE | ID: mdl-32033589

ABSTRACT

The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands, or even millions, of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years.


Subject(s)
Data Science/methods , Genomics/methods , RNA-Seq/methods , Single-Cell Analysis/methods , Animals , Humans
...