ABSTRACT
The genetic architecture of brain structure and function is largely unknown. To investigate this, we carried out genome-wide association studies of 3,144 functional and structural brain imaging phenotypes from UK Biobank (discovery dataset 8,428 subjects). Here we show that many of these phenotypes are heritable. We identify 148 clusters of associations between single nucleotide polymorphisms and imaging phenotypes that replicate at P < 0.05, when we would expect 21 to replicate by chance. Notable significant, interpretable associations include: iron transport and storage genes, related to magnetic susceptibility of subcortical brain tissue; extracellular matrix and epidermal growth factor genes, associated with white matter micro-structure and lesions; genes that regulate mid-line axon development, associated with organization of the pontine crossing tract; and overall 17 genes involved in development, pathway signalling and plasticity. Our results provide insights into the genetic architecture of the brain that are relevant to neurological and psychiatric disorders, brain development and ageing.
Subject(s)
Biological Specimen Banks , Brain/diagnostic imaging , Genome-Wide Association Study , Heredity , Neuroimaging , Phenotype , Polymorphism, Single Nucleotide/genetics , Aging/genetics , Brain/anatomy & histology , Brain/growth & development , Brain/pathology , Datasets as Topic , Epidermal Growth Factor/genetics , Extracellular Matrix , Female , Humans , Iron/metabolism , Male , Neuronal Plasticity/genetics , Putamen/anatomy & histology , Putamen/metabolism , Signal Transduction/genetics , United Kingdom , White Matter/anatomy & histology , White Matter/metabolism , White Matter/pathology
ABSTRACT
OBJECTIVES: To describe the epidemiology of sepsis in critical care by applying the Sepsis-3 criteria to electronic health records. DESIGN: Retrospective cohort study using electronic health records. SETTING: Ten ICUs from four U.K. National Health Service hospital trusts contributing to the National Institute for Health Research Critical Care Health Informatics Collaborative. PATIENTS: A total of 28,456 critical care admissions (14,332 emergency medical, 4,585 emergency surgical, and 9,539 elective surgical). MEASUREMENTS AND MAIN RESULTS: Twenty-nine thousand three hundred forty-three episodes of clinical deterioration were identified with a rise in Sequential Organ Failure Assessment score of at least 2 points, of which 14,869 (50.7%) were associated with antibiotic escalation and thereby met the Sepsis-3 criteria for sepsis. A total of 4,100 episodes of sepsis (27.6%) were associated with vasopressor use and lactate greater than 2.0 mmol/L, and therefore met the Sepsis-3 criteria for septic shock. ICU mortality by source of sepsis was highest for ICU-acquired sepsis (23.7%; 95% CI, 21.9-25.6%), followed by hospital-acquired sepsis (18.6%; 95% CI, 17.5-19.9%), and community-acquired sepsis (12.9%; 95% CI, 12.1-13.6%) (p for comparison less than 0.0001). CONCLUSIONS: We successfully operationalized the Sepsis-3 criteria to an electronic health record dataset to describe the characteristics of critical care patients with sepsis. This may facilitate sepsis research using electronic health record data at scale without relying on human coding.
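The classification rule described in this abstract can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the `Episode` record, its field names, and the label strings are hypothetical, and the thresholds are only those stated in the abstract (a Sequential Organ Failure Assessment rise of at least 2 points, antibiotic escalation, vasopressor use, and lactate greater than 2.0 mmol/L).

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """Hypothetical record for one candidate episode (field names are assumptions)."""
    sofa_rise: int               # rise in SOFA score, in points
    antibiotic_escalation: bool  # antibiotics were started or escalated
    vasopressor_use: bool        # any vasopressor given during the episode
    lactate_mmol_l: float        # lactate measurement in mmol/L


def classify(ep: Episode) -> str:
    """Apply the thresholds quoted in the abstract to a single episode."""
    if ep.sofa_rise < 2:
        return "no deterioration"
    if not ep.antibiotic_escalation:
        return "deterioration without sepsis"
    if ep.vasopressor_use and ep.lactate_mmol_l > 2.0:
        return "septic shock"
    return "sepsis"


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * part / whole, 1)
```

With the counts reported above, `percent(14869, 29343)` and `percent(4100, 14869)` reproduce the stated 50.7% and 27.6%.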
Subject(s)
Critical Care/statistics & numerical data , Cross Infection/mortality , Organ Dysfunction Scores , Sepsis/mortality , Sepsis/therapy , Severity of Illness Index , Adult , Aged , Cohort Studies , Cross Infection/therapy , Female , Humans , Intensive Care Units , Male , Middle Aged , Retrospective Studies , Shock, Septic/mortality , State Medicine
ABSTRACT
We built a reference panel with 342 million autosomal variants using 78,195 individuals from the Genomics England (GEL) dataset, achieving a phasing switch error rate of 0.18% for European samples and imputation quality of r² = 0.75 for variants with minor allele frequencies as low as 2 × 10⁻⁴ in white British samples. The GEL-imputed UK Biobank genome-wide association analysis identified 70% of associations found by direct exome sequencing (P < 2.18 × 10⁻¹¹), while extending testing of rare variants to the entire genome. Coding variants dominated the rare-variant genome-wide association results, implying less disruptive effects of rare non-coding variants.
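As an illustration of the imputation quality figure, the sketch below computes r² for a single variant. It assumes r² denotes the squared Pearson correlation between true genotypes (0, 1 or 2 copies of the alternate allele) and imputed dosages, which is a common convention; the abstract does not spell out how its value was aggregated. The function name `dosage_r2` is hypothetical.

```python
from math import fsum


def dosage_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and imputed dosages.

    Returns NaN when either input has zero variance, since the correlation
    is undefined in that case (for example, a variant absent from the sample).
    """
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = fsum(true_genotypes) / n
    mean_y = fsum(imputed_dosages) / n
    sxx = fsum((x - mean_x) ** 2 for x in true_genotypes)
    syy = fsum((y - mean_y) ** 2 for y in imputed_dosages)
    sxy = fsum((x - mean_x) * (y - mean_y)
               for x, y in zip(true_genotypes, imputed_dosages))
    if sxx == 0 or syy == 0:
        return float("nan")
    return (sxy * sxy) / (sxx * syy)
```

Because the measure is a correlation, dosages that are systematically shrunk towards zero but remain proportional to the truth still score 1; only noise in the ranking of samples lowers r².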
Subject(s)
Gene Frequency , Genome-Wide Association Study , Haplotypes , Polymorphism, Single Nucleotide , Humans , England , Exome Sequencing/methods , Genome, Human , Genome-Wide Association Study/methods , Genomics/methods , UK Biobank , United Kingdom , White People/genetics
ABSTRACT
Inexpensive genotyping methods are essential to modern genomics. Here we present QUILT, which performs diploid genotype imputation using low-coverage whole-genome sequence data. QUILT employs Gibbs sampling to partition reads into maternal and paternal sets, facilitating rapid haploid imputation using large reference panels. We show this partitioning to be accurate over many megabases, enabling highly accurate imputation close to theoretical limits and outperforming existing methods. Moreover, QUILT can impute accurately using diverse technologies, including long reads from Oxford Nanopore Technologies, and a new form of low-cost barcoded Illumina sequencing called haplotagging, with the latter showing improved accuracy at low coverages. Relative to DNA genotyping microarrays, QUILT offers improved accuracy at reduced cost, particularly for diverse populations that are traditionally underserved in modern genomic analyses, with accuracy nearly doubling at rare SNPs. Finally, QUILT can accurately impute (four-digit) human leukocyte antigen types, the first such method from low-coverage sequence data.
Subject(s)
Computational Biology/methods , Genotype , Genotyping Techniques , Whole Genome Sequencing , Computational Biology/economics , Diploidy , Humans , Polymorphism, Single Nucleotide , Reproducibility of Results , Sequence Analysis, DNA
ABSTRACT
Knowledge of genome-wide genealogies for thousands of individuals would simplify most evolutionary analyses for humans and other species, but has remained computationally infeasible. We have developed a method, Relate, scaling to >10,000 sequences while simultaneously estimating branch lengths, mutational ages and variable historical population sizes, as well as allowing for data errors. Application to 1,000 Genomes Project haplotypes produces joint genealogical histories for 26 human populations. Highly diverged lineages are present in all groups, but most frequent in Africa. Outside Africa, these mainly reflect ancient introgression from groups related to Neanderthals and Denisovans, while African signals instead reflect unknown events unique to that continent. Our approach allows more powerful inferences of natural selection than has previously been possible. We identify multiple regions under strong positive selection, and multi-allelic traits including hair color, body mass index and blood pressure, showing strong evidence of directional selection, varying among human groups.
Subject(s)
Evolution, Molecular , Genetics, Population , Genome, Human , Genome-Wide Association Study/methods , Pedigree , Selection, Genetic , Animals , Haplotypes , Humans , Mutation , Neanderthals , Polymorphism, Single Nucleotide , Population Density
ABSTRACT
OBJECTIVE: To build and curate a linkable multi-centre database of high-resolution longitudinal electronic health records (EHR) from adult Intensive Care Units (ICU). To develop a set of open-source tools to make these data 'research ready' while protecting patients' privacy, with a particular focus on anonymisation. MATERIALS AND METHODS: We developed a scalable EHR processing pipeline for extracting, linking, normalising, curating and anonymising EHR data. Patient and public involvement was sought from the outset, and approval to hold these data was granted by the NHS Health Research Authority's Confidentiality Advisory Group (CAG). The data are held in a certified Data Safe Haven. We followed sustainable software development principles throughout, and defined and populated a common data model that links to other clinical areas. RESULTS: Longitudinal EHR data were loaded into the CCHIC database from eleven adult ICUs at 5 UK teaching hospitals. From January 2014 to January 2017, this amounted to 21,930 admissions (18,074 unique patients). Typical admissions have 70 data items pertaining to admission and discharge, and a median of 1,030 (IQR 481-2,335) time-varying measures. Training datasets were made available through virtual machine images emulating the data processing environment. An open-source R package, cleanEHR, was developed and released that transforms the data into a square table readily analysable by most statistical packages. A simple language-agnostic configuration file allows the user to select and clean variables, and impute missing data. An audit trail makes clear the provenance of the data at all times. DISCUSSION: Making health care data available for research is problematic. CCHIC is a unique multi-centre longitudinal and linkable resource that prioritises patient privacy through the highest standards of data security, but also provides tools to clean, organise, and anonymise the data. We believe the development of such tools is essential if we are to meet the twin requirements of respecting patient privacy and working for patient benefit. CONCLUSION: The CCHIC database is now in use by health care researchers from academia and industry. The 'research ready' suite of data preparation tools has facilitated access, and linkage to national databases of secondary care is underway.
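The idea of turning time-varying measures into a square table can be sketched as below. This is written in Python for illustration only and is not the cleanEHR API (cleanEHR is an R package). The function name `to_square_table`, the record layout, and the choice of a fixed time grid are all assumptions made for this example.

```python
from collections import defaultdict


def to_square_table(records, cadence_hours=1.0):
    """Pivot long-format measurements into one row per episode and time slot.

    `records` is an iterable of (episode_id, hours_since_admission, item, value)
    tuples (a hypothetical layout). Each measurement is assigned to a time slot
    of width `cadence_hours`; if several values of the same item fall in one
    slot, later records overwrite earlier ones. Items with no value in a slot
    are filled with None so that every row has the same columns.
    """
    cells = defaultdict(dict)
    items = set()
    for episode_id, hours, item, value in records:
        slot = int(hours // cadence_hours)
        cells[(episode_id, slot)][item] = value
        items.add(item)

    columns = sorted(items)
    rows = []
    for episode_id, slot in sorted(cells):
        row = {"episode_id": episode_id, "time": slot * cadence_hours}
        for column in columns:
            row[column] = cells[(episode_id, slot)].get(column)
        rows.append(row)
    return rows
```

The None cells mark where a later imputation step would act; how missing values are actually filled in is left out of this sketch.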