1.
Cell ; 185(18): 3426-3440.e19, 2022 09 01.
Article in English | MEDLINE | ID: mdl-36055201

ABSTRACT

The 1000 Genomes Project (1kGP) is the largest fully open resource of whole-genome sequencing (WGS) data consented for public distribution without access or use restrictions. The final, phase 3 release of the 1kGP included 2,504 unrelated samples from 26 populations and was based primarily on low-coverage WGS. Here, we present a high-coverage 3,202-sample WGS 1kGP resource, which now includes 602 complete trios, sequenced to a depth of 30X using Illumina. We performed single-nucleotide variant (SNV) and short insertion and deletion (INDEL) discovery and generated a comprehensive set of structural variants (SVs) by integrating multiple analytic methods through a machine learning model. We show gains in sensitivity and precision of variant calls compared to phase 3, especially among rare SNVs as well as INDELs and SVs spanning the frequency spectrum. We also generated an improved reference imputation panel, making variants discovered here accessible for association studies.


Subject(s)
Genome, Human , Whole Genome Sequencing , Female , High-Throughput Nucleotide Sequencing/methods , Humans , INDEL Mutation , Male , Polymorphism, Single Nucleotide
2.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36571484

ABSTRACT

MOTIVATION: Understanding comorbidity is essential for disease prevention, treatment and prognosis. In particular, insight into which pairs of diseases are likely or unlikely to co-occur may help elucidate the potential relationships between complex diseases. Here, we introduce the use of an inter-disease interactivity network to discover/prioritize comorbidities. Specifically, we determine disease associations by accounting for the direction of effects of genetic components shared between diseases, and categorize those associations as synergistic or antagonistic. We further develop a comorbidity scoring algorithm to predict whether diseases are more or less likely to co-occur in the presence of a given index disease. This algorithm can handle networks that incorporate relationships with opposite signs. RESULTS: We finally investigate inter-disease associations among 427 phenotypes in UK Biobank PheWAS data and predict the priority of comorbid diseases. The predicted comorbidities were verified using the UK Biobank inpatient electronic health records. Our findings demonstrate that considering the interaction of phenotype associations might be helpful in better predicting comorbidity. AVAILABILITY AND IMPLEMENTATION: The source code and data of this study are available at https://github.com/dokyoonkimlab/DiseaseInteractiveNetwork. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Biological Specimen Banks , Software , Comorbidity , Phenotype
3.
Am J Hum Genet ; 104(1): 55-64, 2019 01 03.
Article in English | MEDLINE | ID: mdl-30598166

ABSTRACT

Phenome-wide association studies (PheWASs) have been a useful tool for testing associations between genetic variations and multiple complex traits or diagnoses. Linking PheWAS-based associations between phenotypes and a variant or a genomic region into a network provides a new way to investigate cross-phenotype associations, and it might broaden the understanding of genetic architecture that exists between diagnoses, genes, and pleiotropy. We created a network of associations from one of the largest PheWASs on electronic health record (EHR)-derived phenotypes across 38,682 unrelated samples from the Geisinger's biobank; the samples were genotyped through the DiscovEHR project. We computed associations between 632,574 common variants and 541 diagnosis codes. Using these associations, we constructed a "disease-disease" network (DDN) wherein pairs of diseases were connected on the basis of shared associations with a given genetic variant. The DDN provides a landscape of intra-connections within the same disease classes, as well as inter-connections across disease classes. We identified clusters of diseases with known biological connections, such as autoimmune disorders (type 1 diabetes, rheumatoid arthritis, and multiple sclerosis) and cardiovascular disorders. Previously unreported relationships between multiple diseases were identified on the basis of genetic associations as well. The network approach applied in this study can be used to uncover interactions between diseases as a result of their shared, potentially pleiotropic SNPs. Additionally, this approach might advance clinical research and even clinical practice by accelerating our understanding of disease mechanisms on the basis of similar underlying genetic associations.
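
The network construction described in this abstract (diseases connected when they share an association with a given variant) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `build_ddn`, the input format, and the toy variant labels are all assumptions made for illustration, and a real analysis would first filter associations by a significance threshold.

```python
from collections import defaultdict
from itertools import combinations


def build_ddn(associations):
    """Build a toy disease-disease network.

    `associations` is an iterable of (variant, disease) pairs that are
    assumed to have already passed a significance filter.
    Returns {(disease_a, disease_b): number_of_shared_variants}.
    """
    # Group the diseases associated with each variant.
    diseases_by_variant = defaultdict(set)
    for variant, disease in associations:
        diseases_by_variant[variant].add(disease)

    # Connect every pair of diseases that share a variant; the edge
    # weight counts how many variants the pair has in common.
    edges = defaultdict(int)
    for diseases in diseases_by_variant.values():
        for a, b in combinations(sorted(diseases), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Hypothetical toy input: variant labels are placeholders, not real IDs.
toy_associations = [
    ("var_A", "type 1 diabetes"),
    ("var_A", "rheumatoid arthritis"),
    ("var_B", "type 1 diabetes"),
    ("var_B", "rheumatoid arthritis"),
    ("var_B", "multiple sclerosis"),
    ("var_C", "hypertension"),
]
ddn = build_ddn(toy_associations)
```

In this toy input, a disease linked only to a variant that no other disease shares (here "hypertension") gets no edge, which mirrors the idea that edges reflect shared genetic associations.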


Subject(s)
Disease/genetics , Electronic Health Records , Genetic Association Studies , Phenotype , Polymorphism, Single Nucleotide/genetics , Autoimmune Diseases/genetics , Cardiovascular Diseases/genetics , Epigenomics , Humans
4.
Hum Genomics ; 15(1): 44, 2021 07 13.
Article in English | MEDLINE | ID: mdl-34256850

ABSTRACT

BACKGROUND: Previous research in autism and other neurodevelopmental disorders (NDDs) has indicated an important contribution of protein-coding (coding) de novo variants (DNVs) within specific genes. The role of de novo noncoding variation has been observable as a general increase in genetic burden but has yet to be resolved to individual functional elements. In this study, we assessed whole-genome sequencing data in 2671 families with autism (discovery cohort of 516 families, replication cohort of 2155 families). We focused on DNVs in enhancers with characterized in vivo activity in the brain and identified an excess of DNVs in an enhancer named hs737. RESULTS: We adapted the fitDNM statistical model to work in noncoding regions and tested enhancers for excess of DNVs in families with autism. We found only one enhancer (hs737) with nominal significance in the discovery (p = 0.0172), replication (p = 2.5 × 10^-3), and combined dataset (p = 1.1 × 10^-4). Each individual with a DNV in hs737 had shared phenotypes including being male, intact cognitive function, and hypotonia or motor delay. Our in vitro assessment of the DNVs showed they all reduce enhancer activity in a neuronal cell line. By epigenomic analyses, we found that hs737 is brain-specific and targets the transcription factor gene EBF3 in human fetal brain. EBF3 is genome-wide significant for coding DNVs in NDDs (missense p = 8.12 × 10^-35, loss-of-function p = 2.26 × 10^-13) and is widely expressed in the body. Through characterization of promoters bound by EBF3 in neuronal cells, we saw enrichment for binding to NDD genes (p = 7.43 × 10^-6, OR = 1.87) involved in gene regulation. Individuals with coding DNVs have greater phenotypic severity (hypotonia, ataxia, and delayed development syndrome [HADDS]) in comparison to individuals with noncoding DNVs that have autism and hypotonia. CONCLUSIONS: In this study, we identify DNVs in the hs737 enhancer in individuals with autism. Through multiple approaches, we find hs737 targets the gene EBF3 that is genome-wide significant in NDDs. By assessment of noncoding variation and the genes they affect, we are beginning to understand their impact on gene regulatory networks in NDDs.


Subject(s)
Autistic Disorder/genetics , Genetic Predisposition to Disease , Muscle Hypotonia/genetics , Neurodevelopmental Disorders/genetics , Transcription Factors/genetics , Autistic Disorder/epidemiology , Autistic Disorder/pathology , Enhancer Elements, Genetic/genetics , Exome/genetics , Female , Gene Regulatory Networks/genetics , Humans , Male , Muscle Hypotonia/epidemiology , Muscle Hypotonia/pathology , Mutation/genetics , Neurodevelopmental Disorders/epidemiology , Neurodevelopmental Disorders/pathology , Neurons/metabolism , Neurons/pathology
5.
Bioinformatics ; 34(3): 527-529, 2018 02 01.
Article in English | MEDLINE | ID: mdl-28968757

ABSTRACT

Motivation: BioBin is an automated bioinformatics tool for the multi-level biological binning of sequence variants. Herein, we present a significant update to BioBin which expands the software to facilitate a comprehensive rare variant analysis and incorporates novel features and analysis enhancements. Results: In BioBin 2.3, we extend our software tool by implementing statistical association testing, updating the binning algorithm, as well as incorporating novel analysis features providing for a robust, highly customizable, and unified rare variant analysis tool. Availability and implementation: The BioBin software package is open source and freely available to users at http://www.ritchielab.com/software/biobin-download. Contact: mdritchie@geisinger.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Genetic Association Studies/methods , Genetic Variation , Software , Algorithms , Genomics/methods
6.
Proc Natl Acad Sci U S A ; 109(43): 17573-8, 2012 Oct 23.
Article in English | MEDLINE | ID: mdl-23045704

ABSTRACT

Patients with Down syndrome (trisomy 21, T21) have hematologic abnormalities throughout life. Newborns frequently exhibit abnormal blood counts and a clonal preleukemia. Human T21 fetal livers contain expanded erythro-megakaryocytic precursors with enhanced proliferative capacity. The impact of T21 on the earliest stages of embryonic hematopoiesis is unknown and nearly impossible to examine in human subjects. We modeled T21 yolk sac hematopoiesis using human induced pluripotent stem cells (iPSCs). Blood progenitor populations generated from T21 iPSCs were present at normal frequency and proliferated normally. However, their developmental potential was altered with enhanced erythropoiesis and reduced myelopoiesis, but normal megakaryocyte production. These abnormalities overlap with those of T21 fetal livers, but also reflect important differences. Our studies show that T21 confers distinct developmental stage- and species-specific hematopoietic defects. More generally, we illustrate how iPSCs can provide insight into early stages of normal and pathological human development.


Subject(s)
Down Syndrome , Hematopoiesis/genetics , Pluripotent Stem Cells/cytology , Cell Differentiation , Gene Expression Profiling , Humans , RNA, Messenger/genetics , Real-Time Polymerase Chain Reaction
7.
J Cell Sci ; 125(Pt 23): 5790-9, 2012 Dec 01.
Article in English | MEDLINE | ID: mdl-22992457

ABSTRACT

In moving cells dynamic microtubules (MTs) target and disassemble substrate adhesion sites (focal adhesions; FAs) in a process that enables the cell to detach from the substrate and propel itself forward. The short-range interactions between FAs and MT plus ends have been observed in several experimental systems, but the spatial overlap of these structures within the cell has precluded analysis of the putative long-range mechanisms by which MTs growing through the cell body reach FAs in the periphery of the cell. In the work described here cell geometry was controlled to remove the spatial overlap of cellular structures thus allowing for unambiguous observation of MT guidance. Specifically, micropatterning of living cells was combined with high-resolution in-cell imaging and gene product depletion by means of RNA interference to study the long-range MT guidance in quantitative detail. Cells were confined on adhesive triangular microislands that determined cell shape and ensured that FAs localized exclusively at the vertices of the triangular cells. It is shown that initial MT nucleation at the centrosome is random in direction, while the alignment of MT trajectories with the targets (i.e. FAs at vertices) increases with an increasing distance from the centrosome, indicating that MT growth is a non-random, guided process. The guided MT growth is dependent on the presence of FAs at the vertices. The depletion of either myosin IIA or myosin IIB results in depletion of F-actin bundles and spatially unguided MT growth. Taken together our findings provide quantitative evidence of a role for long-range MT guidance in MT targeting of FAs.


Subject(s)
Microtubules/metabolism , Actins/metabolism , Animals , Cell Line , HeLa Cells , Humans , Myosin Type II/metabolism , RNA Interference , Rats
8.
AMIA Jt Summits Transl Sci Proc ; 2023: 487-496, 2023.
Article in English | MEDLINE | ID: mdl-37350926

ABSTRACT

Modeling with longitudinal electronic health record (EHR) data proves challenging given the high dimensionality, redundancy, and noise captured in EHR. In order to improve precision medicine strategies and identify predictors of disease risk in advance, evaluating meaningful patient disease trajectories is essential. In this study, we develop the algorithm DiseasE Trajectory fEature extraCTion (DETECT) for feature extraction and trajectory generation in high-throughput temporal EHR data. This algorithm can 1) simulate longitudinal individual-level EHR data, specified to user parameters of scale, complexity, and noise and 2) use a convergent relative risk framework to test intermediate codes occurring between specified index code(s) and outcome code(s) to determine if they are predictive features of the outcome. Temporal range can be specified to investigate predictors occurring during a specific period of time prior to onset of the outcome. We benchmarked our method on simulated data and generated real-world disease trajectories using DETECT in a cohort of 145,575 individuals diagnosed with hypertension in Penn Medicine EHR for severe cardiometabolic outcomes.

9.
Nat Neurosci ; 26(1): 150-162, 2023 01.
Article in English | MEDLINE | ID: mdl-36482247

ABSTRACT

Amyotrophic lateral sclerosis (ALS) is a progressively fatal neurodegenerative disease affecting motor neurons in the brain and spinal cord. In this study, we investigated gene expression changes in ALS via RNA sequencing in 380 postmortem samples from cervical, thoracic and lumbar spinal cord segments from 154 individuals with ALS and 49 control individuals. We observed an increase in microglia and astrocyte gene expression, accompanied by a decrease in oligodendrocyte gene expression. By creating a gene co-expression network in the ALS samples, we identified several activated microglia modules that negatively correlate with retrospective disease duration. We mapped molecular quantitative trait loci and found several potential ALS risk loci that may act through gene expression or splicing in the spinal cord and assign putative cell types for FNBP1, ACSL5, SH3RF1 and NFASC. Finally, we outline how common genetic variants associated with splicing of C9orf72 act as proxies for the well-known repeat expansion, and we use the same mechanism to suggest ATXN3 as a putative risk gene.


Subject(s)
Amyotrophic Lateral Sclerosis , Neurodegenerative Diseases , Humans , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/metabolism , Neurodegenerative Diseases/metabolism , Retrospective Studies , Transcriptome , Spinal Cord/metabolism
10.
iScience ; 25(2): 103760, 2022 Feb 18.
Article in English | MEDLINE | ID: mdl-35036860

ABSTRACT

Impressive global efforts have identified both rare and common gene variants associated with severe COVID-19 using sequencing technologies. However, these studies lack the sensitivity to accurately detect several classes of variants, especially large structural variants (SVs), which account for a substantial proportion of genetic diversity including clinically relevant variation. We performed optical genome mapping on 52 severely ill COVID-19 patients to identify rare/unique SVs as decisive predisposition factors associated with COVID-19. We identified 7 SVs involving genes implicated in two key host-viral interaction pathways: innate immunity and inflammatory response, and viral replication and spread in nine patients, of which SVs in STK26 and DPP4 genes are the most intriguing candidates. This study is the first to systematically assess the potential role of SVs in the pathogenesis of COVID-19 severity and highlights the need to evaluate SVs along with sequencing variants to comprehensively associate genomic information with interindividual variability in COVID-19 phenotypes.

11.
Cell Genom ; 2(5)2022 May.
Article in English | MEDLINE | ID: mdl-36452119

ABSTRACT

Genome in a Bottle benchmarks are widely used to help validate clinical sequencing pipelines and develop variant calling and sequencing methods. Here we use accurate linked and long reads to expand benchmarks in 7 samples to include difficult-to-map regions and segmental duplications that are challenging for short reads. These benchmarks add more than 300,000 SNVs and 50,000 insertions or deletions (indels) and include 16% more exonic variants, many in challenging, clinically relevant genes not covered previously, such as PMS2. For HG002, we include 92% of the autosomal GRCh38 assembly while excluding regions problematic for benchmarking small variants, such as copy number variants, that should not have been in the previous version, which included 85% of GRCh38. It identifies eight times more false negatives in a short read variant call set relative to our previous benchmark. We demonstrate that this benchmark reliably identifies false positives and false negatives across technologies, enabling ongoing methods development.

12.
Nat Genet ; 53(8): 1125-1134, 2021 08.
Article in English | MEDLINE | ID: mdl-34312540

ABSTRACT

Autism is a highly heritable complex disorder in which de novo mutation (DNM) variation contributes significantly to risk. Using whole-genome sequencing data from 3,474 families, we investigate another source of large-effect risk variation, ultra-rare variants. We report and replicate a transmission disequilibrium of private, likely gene-disruptive (LGD) variants in probands but find that 95% of this burden resides outside of known DNM-enriched genes. This variant class more strongly affects multiplex family probands and supports a multi-hit model for autism. Candidate genes with private LGD variants preferentially transmitted to probands converge on the E3 ubiquitin-protein ligase complex, intracellular transport and Erb signaling protein networks. We estimate that these variants are approximately 2.5 generations old and significantly younger than other variants of similar type and frequency in siblings. Overall, private LGD variants are under strong purifying selection and appear to act on a distinct set of genes not yet associated with autism.


Subject(s)
Autism Spectrum Disorder/genetics , Genetic Predisposition to Disease , Proteins/genetics , Autistic Disorder/genetics , Evolution, Molecular , Gene Dosage , Haplotypes , Humans , Linkage Disequilibrium , Models, Genetic , Mutation , Pedigree , Polymorphism, Single Nucleotide , Protein Interaction Maps/genetics , Siblings , Whole Genome Sequencing
13.
Nat Biotechnol ; 39(9): 1129-1140, 2021 09.
Article in English | MEDLINE | ID: mdl-34504351

ABSTRACT

Assessing the reproducibility, accuracy and utility of massively parallel DNA sequencing platforms remains an ongoing challenge. Here the Association of Biomolecular Resource Facilities (ABRF) Next-Generation Sequencing Study benchmarks the performance of a set of sequencing instruments (HiSeq/NovaSeq/paired-end 2 × 250-bp chemistry, Ion S5/Proton, PacBio circular consensus sequencing (CCS), Oxford Nanopore Technologies PromethION/MinION, BGISEQ-500/MGISEQ-2000 and GS111) on human and bacterial reference DNA samples. Among short-read instruments, HiSeq 4000 and X10 provided the most consistent, highest genome coverage, while BGI/MGISEQ provided the lowest sequencing error rates. The long-read instrument PacBio CCS had the highest reference-based mapping rate and lowest non-mapping rate. The two long-read platforms PacBio CCS and PromethION/MinION showed the best sequence mapping in repeat-rich areas and across homopolymers. NovaSeq 6000 using 2 × 250-bp read chemistry was the most robust instrument for capturing known insertion/deletion events. This study serves as a benchmark for current genomics technologies, as well as a resource to inform experimental design and next-generation sequencing variant calling.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/standards , Base Pair Mismatch , Benchmarking , DNA/genetics , DNA, Bacterial/genetics , Genome, Bacterial , Genome, Human , Humans
14.
Science ; 372(6537)2021 04 02.
Article in English | MEDLINE | ID: mdl-33632895

ABSTRACT

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.
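
The contiguity measure quoted in this abstract ("minimum contig length needed to cover 50% of the genome") is commonly called N50. A minimal sketch of how such a value can be computed follows; the function name `n50` and the optional `genome_size` parameter are assumptions for illustration, and the abstract does not specify which denominator (assembly total or reference genome size) the authors used.

```python
def n50(contig_lengths, genome_size=None):
    """Return the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the target size.

    If `genome_size` is None, the target is the sum of all contig lengths;
    otherwise the given genome size is used as the denominator.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = genome_size if genome_size is not None else sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare with integers to avoid floating-point division.
        if 2 * running >= total:
            return length
    return 0  # the contigs cover less than half of the target size


# Toy example with made-up contig lengths.
example = n50([5, 4, 3, 2, 1])
```

Passing a fixed `genome_size` makes values comparable across assemblies of different total length, which is one reason some reports prefer that variant.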


Subject(s)
Genetic Variation , Genome, Human , Haplotypes , Female , Genotype , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Interspersed Repetitive Sequences , Male , Population Groups/genetics , Quantitative Trait Loci , Retroelements , Sequence Analysis, DNA , Sequence Inversion , Whole Genome Sequencing
15.
Pac Symp Biocomput ; 22: 177-183, 2017.
Article in English | MEDLINE | ID: mdl-27896973

ABSTRACT

Given the exponential growth of biomedical data, researchers are faced with numerous challenges in extracting and interpreting information from these large, high-dimensional, incomplete, and often noisy data. To facilitate addressing this growing concern, the "Patterns in Biomedical Data-How do we find them?" session of the 2017 Pacific Symposium on Biocomputing (PSB) is devoted to exploring pattern recognition using data-driven approaches for biomedical and precision medicine applications. The papers selected for this session focus on novel machine learning techniques as well as applications of established methods to heterogeneous data. We also feature manuscripts aimed at addressing the current challenges associated with the analysis of biomedical data.

17.
J Clin Invest ; 125(3): 993-1005, 2015 Mar 02.
Article in English | MEDLINE | ID: mdl-25621499

ABSTRACT

Germline GATA1 mutations that result in the production of an amino-truncated protein termed GATA1s (where s indicates short) cause congenital hypoplastic anemia. In patients with trisomy 21, similar somatic GATA1s-producing mutations promote transient myeloproliferative disease and acute megakaryoblastic leukemia. Here, we demonstrate that induced pluripotent stem cells (iPSCs) from patients with GATA1-truncating mutations exhibit impaired erythroid potential, but enhanced megakaryopoiesis and myelopoiesis, recapitulating the major phenotypes of the associated diseases. Similarly, in developmentally arrested GATA1-deficient murine megakaryocyte-erythroid progenitors derived from murine embryonic stem cells (ESCs), expression of GATA1s promoted megakaryopoiesis, but not erythropoiesis. Transcriptome analysis revealed a selective deficiency in the ability of GATA1s to activate erythroid-specific genes within populations of hematopoietic progenitors. Although its DNA-binding domain was intact, chromatin immunoprecipitation studies showed that GATA1s binding at specific erythroid regulatory regions was impaired, while binding at many nonerythroid sites, including megakaryocytic and myeloid target genes, was normal. Together, these observations indicate that lineage-specific GATA1 cofactor associations are essential for normal chromatin occupancy and provide mechanistic insights into how GATA1s mutations cause human disease. More broadly, our studies underscore the value of ESCs and iPSCs to recapitulate and study disease phenotypes.


Subject(s)
GATA1 Transcription Factor/physiology , Induced Pluripotent Stem Cells/metabolism , Animals , Cells, Cultured , Chromatin/metabolism , Embryonic Stem Cells/metabolism , Epigenesis, Genetic , Erythroid Cells , Erythropoiesis , Humans , Mice , Mutation , Protein Structure, Tertiary , Single-Cell Analysis , Transcriptome
18.
Biomaterials ; 33(20): 5004-12, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22494889

ABSTRACT

Dynamic cell-microenvironment interactions regulate many biological events and play a critical role in tissue regeneration. Cell homing to targeted tissues requires well balanced interactions between cells and adhesion molecules on blood vessel walls. However, many stem cells lack affinity with adhesion molecules. It is challenging and clinically important to engineer these stem cells to modulate their dynamic interactions with blood vessels. In this study, a new chemical strategy was developed to engineer cell-microenvironment interactions. This method allowed the conjugation of peptides onto stem cell membranes without affecting cell viability, proliferation or multipotency. Mesenchymal stem cells (MSCs) engineered in this manner showed controlled firm adhesion and rolling on E-selectin under physiological shear stresses. For the first time, these biomechanical responses were achieved by tuning the binding kinetics of the peptide-selectin interaction. Rolling of engineered MSCs on E-selectin is mediated by a Ca(2+) independent interaction, a mechanism that differs from the Ca(2+) dependent physiological process. This further illustrates the ability of this approach to manipulate cell-microenvironment interactions, in particular for the application of delivering cells to targeted tissues. It also provides a new platform to engineer cells with multiple functionalities.


Subject(s)
Cell Membrane/metabolism , Peptides/metabolism , Selectins/metabolism , Stem Cells/cytology , Cell Differentiation , Cell Proliferation , Humans , Kinetics , Microfluidics , Tissue Engineering