ABSTRACT
Comprehensive genome annotation is essential to understand the impact of clinically relevant variants. However, the absence of a standard for clinical reporting and browser display complicates the process of consistent interpretation and reporting. To address these challenges, Ensembl/GENCODE (ref. 1) and RefSeq (ref. 2) launched a joint initiative, the Matched Annotation from NCBI and EMBL-EBI (MANE) collaboration, to converge on human gene and transcript annotation and to jointly define a high-value set of transcripts and corresponding proteins. Here, we describe the MANE transcript sets for use as universal standards for variant reporting and browser display. The MANE Select set identifies a representative transcript for each human protein-coding gene, whereas the MANE Plus Clinical set provides additional transcripts at loci where the Select transcripts alone are not sufficient to report all currently known clinical variants. Each MANE transcript represents an exact match between the exonic sequences of an Ensembl/GENCODE transcript and its counterpart in RefSeq such that the identifiers can be used synonymously. We have now released MANE Select transcripts for 97% of human protein-coding genes, including all American College of Medical Genetics and Genomics Secondary Findings list v3.0 (ref. 3) genes. MANE transcripts are accessible from major genome browsers and key resources. Widespread adoption of these transcript sets will increase the consistency of reporting, facilitate the exchange of data regardless of the annotation source and help to streamline clinical interpretation.
Subject(s)
Computational Biology , Databases, Genetic , Genomics , Genome , Humans , Information Dissemination , Molecular Sequence Annotation , National Library of Medicine (U.S.) , United States
ABSTRACT
The Long-read RNA-Seq Genome Annotation Assessment Project Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. Using different protocols and sequencing platforms, the consortium generated over 427 million long-read sequences from complementary DNA and direct RNA datasets, encompassing human, mouse and manatee species. Developers utilized these data to address challenges in transcript isoform detection, quantification and de novo transcript detection. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. Incorporating additional orthogonal data and replicate samples is advised when aiming to detect rare and novel transcripts or using reference-free approaches. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.
Subject(s)
Gene Expression Profiling , RNA-Seq , Humans , Animals , Mice , RNA-Seq/methods , Gene Expression Profiling/methods , Transcriptome , Sequence Analysis, RNA/methods , Molecular Sequence Annotation/methods
ABSTRACT
Ensembl (https://www.ensembl.org) is a freely available genomic resource that has produced high-quality annotations, tools, and services for vertebrates and model organisms for more than two decades. In recent years, there has been a dramatic shift in the genomic landscape, with a large increase in the number and phylogenetic breadth of high-quality reference genomes, alongside major advances in the pan-genome representations of higher species. In order to support these efforts and accelerate downstream research, Ensembl continues to focus on scaling for the rapid annotation of new genome assemblies, developing new methods for comparative analysis, and expanding the depth and quality of our genome annotations. This year we have continued our expansion to support global biodiversity research, doubling the number of annotated genomes we support on our Rapid Release site to over 1700, driven by our close collaboration with biodiversity projects such as Darwin Tree of Life. We have also strengthened support for key agricultural species, including the first regulatory builds for farmed animals, and have updated key tools and resources that support the global scientific community, notably the Ensembl Variant Effect Predictor. Ensembl data, software, and tools are freely available.
Subject(s)
Databases, Genetic , Genomics , Animals , Genome , Molecular Sequence Annotation , Phylogeny , Software , Humans
ABSTRACT
GENCODE produces high quality gene and transcript annotation for the human and mouse genomes. All GENCODE annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics. The GENCODE consortium generates targeted experimental data, develops bioinformatic tools and carries out analyses that, along with externally produced data and methods, support the identification and annotation of transcript structures and the determination of their function. Here, we present an update on the annotation of human and mouse genes, including developments in the tools, data, analyses and major collaborations which underpin this progress. For example, we report the creation of a set of non-canonical ORFs identified in GENCODE transcripts, the LRGASP collaboration to assess the use of long transcriptomic data to build transcript models, the progress in collaborations with RefSeq and UniProt to increase convergence in the annotation of human and mouse protein-coding genes, the propagation of GENCODE across the human pan-genome and the development of new tools to support annotation of regulatory features by GENCODE. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.
Subject(s)
Computational Biology , Genome, Human , Humans , Animals , Mice , Molecular Sequence Annotation , Computational Biology/methods , Genome, Human/genetics , Transcriptome/genetics , Gene Expression Profiling , Databases, Genetic
ABSTRACT
Ensembl (https://www.ensembl.org) has produced high-quality genomic resources for vertebrates and model organisms for more than twenty years. During that time, our resources, services and tools have continually evolved in line with both the publicly available genome data and the downstream research and applications that utilise the Ensembl platform. In recent years we have witnessed a dramatic shift in the genomic landscape. There has been a large increase in the number of high-quality reference genomes through global biodiversity initiatives. In parallel, there have been major advances towards pangenome representations of higher species, where many alternative genome assemblies representing different breeds, cultivars, strains and haplotypes are now available. In order to support these efforts and accelerate downstream research, it is our goal at Ensembl to create high-quality annotations, tools and services for species across the tree of life. Here, we report our resources for popular reference genomes, the dramatic growth of our annotations (including haplotypes from the first human pangenome graphs), updates to the Ensembl Variant Effect Predictor (VEP), interactive protein structure predictions from AlphaFold DB, and the beta release of our new website.
Subject(s)
Databases, Genetic , Software , Animals , Humans , Molecular Sequence Annotation , Genomics , Genome
ABSTRACT
Evolution by natural selection is an explicitly genetic theory. Darwin recognized that a working theory of inheritance was central to his theory and spent much of his scientific life seeking one. The seeds of his attempt to fill this gap, his "provisional hypothesis" of pangenesis, appear in his notebooks when he was first formulating his evolutionary ideas. Darwin, in short, desperately needed Mendel. In this paper, we set Mendel's work in the context of experimental biology and animal/plant breeding of the period and review both the well-known story of possible contact between Mendel and Darwin and the actual contact between their ideas after their deaths. Mendel's contributions to evolutionary biology were fortuitous. Regardless, it is Mendel's work that completed Darwin's theory. The modern theory based on the marriage between Mendel's and Darwin's ideas as forged most comprehensively by R. A. Fisher is both Darwin's achievement and Mendel's.
Subject(s)
Biological Evolution , Breeding , Genetics , Selection, Genetic , Animals , Breeding/history , Genetics/history , History, 19th Century , Inheritance Patterns , Plants/genetics , Probability , Seeds
ABSTRACT
OBJECTIVE: To uncover the values and preferences of the caregivers for children with medical complexity (CMC), using the test case of surgical treatment decision-making for pediatric neuromuscular scoliosis (NMS) that will inform the future development of a decision support tool in this population. STUDY DESIGN: We conducted a qualitative study of semi-structured interviews of English- and Spanish-speaking caregivers of children with NMS from two geographically distinct children's hospitals. We used purposive sampling of language and treatment options selected to capture diverse experiences. Analysis was based on grounded theory with synthesized caregiver values and preferences themes. RESULTS: From 47 participants, we completed 41 interviews (9 in Spanish). Caregivers had a mean age of 43.2 years, were mostly White (66%), and had children with a mean age of 15.6 years; 64% chose surgery. The following values and preferences were important to many caregivers: reducing scoliosis-related pain, minimizing mobility limitations to optimize socio-emotional quality of life, limiting the impact of comorbidities on overall quality of life, information provided by peer support, the uncertainty of outcomes due to underlying comorbidities, and the uncertainty related to the anticipated progression of their child's scoliosis curve. Caregivers experienced immense uncertainty related to treatment outcomes due to their child's comorbidities. CONCLUSIONS: Caregivers of CMC may benefit from decision support that includes both values clarification exercises to help caregivers identify which of the many possible values and preferences are important to them and novel methods to communicate uncertainty in the care of CMC.
ABSTRACT
We have made the compound 2O-BaPtO3 by high-pressure, high-temperature synthesis, determined its structure, and tested its catalytic activity. Compounds of the same stoichiometry have been reported and tentatively identified as hexagonal perovskites, and although no structural model was ever established, 2O-BaPtO3 is clearly different and, to the best of our knowledge, unique. It features continuous chains of face-sharing PtO6 octahedra, like the well-known 2H hexagonal perovskite type, but with a staggered offset between the chains that breaks hexagonal symmetry and disrupts the close-packed array of A = Ba and X = O that is a defining characteristic of ABX3 perovskites. We investigated this structure and its stability vs the conventional 2H form using X-ray and neutron diffraction, X-ray absorption spectroscopy, and ab initio calculations. Catalytic testing of 2O-BaPtO3 showed that it is active for hydrogen evolution.
Subject(s)
Biological Evolution , Biology , Expeditions , Fires , Biology/history , Expeditions/history , History, 19th Century , Fires/history
ABSTRACT
Ensembl (https://www.ensembl.org) is unique in its flexible infrastructure for access to genomic data and annotation. It has been designed to efficiently deliver annotation at scale for all eukaryotic life, and it also provides deep comprehensive annotation for key species. Genomes representing a greater diversity of species are increasingly being sequenced. In response, we have focussed our recent efforts on expediting the annotation of new assemblies. Here, we report the release of the greatest annual number of newly annotated genomes in the history of Ensembl via our dedicated Ensembl Rapid Release platform (http://rapid.ensembl.org). We have also developed a new method to generate comparative analyses at scale for these assemblies and, for the first time, we have annotated non-vertebrate eukaryotes. Meanwhile, we continually improve, extend and update the annotation for our high-value reference vertebrate genomes and report the details here. We have a range of specific software tools for specific tasks, such as the Ensembl Variant Effect Predictor (VEP) and the newly developed interface for the Variant Recoder. All Ensembl data, software and tools are freely available for download and are accessible programmatically.
Subject(s)
Databases, Genetic , Genome/genetics , Molecular Sequence Annotation , Software , Animals , Computational Biology/classification , Humans
ABSTRACT
The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.
Subject(s)
COVID-19/prevention & control , Computational Biology/methods , Databases, Genetic , Genomics/methods , Molecular Sequence Annotation/methods , SARS-CoV-2/genetics , Animals , COVID-19/epidemiology , COVID-19/virology , Epidemics , Humans , Internet , Mice , Pseudogenes/genetics , RNA, Long Noncoding/genetics , SARS-CoV-2/metabolism , SARS-CoV-2/physiology , Transcription, Genetic/genetics
ABSTRACT
INTRODUCTION: Over the last decade, there has been a 32% decrease in independent plastic surgery fellowships. The growing prevalence of 6-year integrated plastic surgery residencies, duty hour restrictions, and new subspecialty training fellowships for general surgeons have changed the training experience of plastic surgery fellows. METHODS: A retrospective review of the Accreditation Council for Graduate Medical Education (ACGME) case logs for graduating fellows of independent plastic surgery fellowships in the United States was conducted from 2011 to 2019. A linear regression analysis was conducted for each case log code and category, and a 95% level of confidence was assumed (α = 0.05). RESULTS: In 2011, 141 residents from 69 programs graduated with an average of 1469.7 cases. In 2019, 84 residents from 47 programs graduated with an average of 1952 cases. Index procedures significantly increased overall during the 9 y (P < 0.001). Categorical cases increased in esthetics (P < 0.001), including facelift, browlift, blepharoplasty, and more. Categorical cases increased in reconstructive surgery (P < 0.001), including treatment of deformities of the skin, lower extremities, and trunk, nerve decompression, and hand reconstruction. In breast procedures, an increase was seen in reduction mammoplasty, reconstruction, and treatment of other breast deformities. In head and neck procedures, an increase was seen in resection of head and neck neoplasms and secondary cleft lip repair. Decreases in procedural numbers were seen in primary cleft lip repair and hand reconstruction by primary closure. CONCLUSIONS: Despite a 32% decline in the number of independent plastic surgery fellowships over the last 9 y, plastic surgery fellows are obtaining significantly more surgical experience, both in esthetic and reconstructive surgery.
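As a rough arithmetic check on the figures quoted in this abstract, the short Python sketch below recomputes the relative changes between 2011 and 2019 from the reported counts. The helper name `percent_change` is illustrative only and does not come from the paper. The sketch also assumes that the "32% decline" refers to the number of programs (69 to 47), which is not stated explicitly.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0

# Counts as reported in the abstract (2011 vs 2019).
programs_change = percent_change(69, 47)      # number of programs
residents_change = percent_change(141, 84)    # graduating residents
cases_change = percent_change(1469.7, 1952)   # average cases per graduate

print(f"programs:  {programs_change:.1f}%")
print(f"residents: {residents_change:.1f}%")
print(f"cases:     {cases_change:.1f}%")
```

Under that assumption, the drop in programs comes out at about 32%, in line with the stated decline, while the average case count per graduate rises by roughly a third.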
Subject(s)
Cleft Lip , General Surgery , Internship and Residency , Mammaplasty , Surgery, Plastic , Accreditation , Clinical Competence , Education, Medical, Graduate/methods , Fellowships and Scholarships , General Surgery/education , Humans , Surgery, Plastic/education , United States
ABSTRACT
Androgen biosynthesis in the human fetus proceeds through the adrenal sex steroid precursor dehydroepiandrosterone, which is converted to testosterone in the gonads, followed by further activation to 5α-dihydrotestosterone in genital skin, thereby facilitating male external genital differentiation. Congenital adrenal hyperplasia due to P450 oxidoreductase deficiency results in disrupted dehydroepiandrosterone biosynthesis, explaining undervirilization in affected boys. However, many affected girls are born virilized, despite low circulating androgens. We hypothesized that this is due to a prenatally active, alternative androgen biosynthesis pathway from 17α-hydroxyprogesterone to 5α-dihydrotestosterone, which bypasses dehydroepiandrosterone and testosterone, with increased activity in congenital adrenal hyperplasia variants associated with 17α-hydroxyprogesterone accumulation. Here we employ explant cultures of human fetal organs (adrenals, gonads, genital skin) from the major period of sexual differentiation and show that alternative pathway androgen biosynthesis is active in the fetus, as assessed by liquid chromatography-tandem mass spectrometry. We found androgen receptor expression in male and female genital skin using immunohistochemistry and demonstrated that both 5α-dihydrotestosterone and adrenal explant culture supernatant induce nuclear translocation of the androgen receptor in female genital skin primary cultures. Analyzing urinary steroid excretion by gas chromatography-mass spectrometry, we show that neonates with P450 oxidoreductase deficiency produce androgens through the alternative androgen pathway during the first weeks of life. We provide quantitative in vitro evidence that the corresponding P450 oxidoreductase mutations predominantly support alternative pathway androgen biosynthesis. These results indicate a key role of alternative pathway androgen biosynthesis in the prenatal virilization of girls affected by congenital adrenal hyperplasia due to P450 oxidoreductase deficiency.
Subject(s)
17-alpha-Hydroxyprogesterone/metabolism , Androgens/biosynthesis , Antley-Bixler Syndrome Phenotype/genetics , Fetus/metabolism , Receptors, Androgen/genetics , Virilism/metabolism , Adrenal Glands/embryology , Adrenal Glands/metabolism , Androgens/genetics , Cells, Cultured , Female , Fetus/embryology , Genitalia/embryology , Genitalia/metabolism , Gonads/embryology , Gonads/metabolism , Humans , Male , Receptors, Androgen/metabolism , Sex Differentiation , Virilism/genetics
ABSTRACT
We explored the degree to which political bias in medicine and study authors could explain the stark variation in hydroxychloroquine (HCQ)/chloroquine (CQ) study favorability in the US compared to the rest of the world. We reviewed COVID-19/SARS-CoV-2 preprints and published papers from January 1, 2020 to July 26, 2020 that involved hydroxychloroquine and/or chloroquine; 267 met study criteria, 68 of them from the US. A control subset was selected. HCQ/CQ study result favorability (favorable, unfavorable, or neutral) was noted. The first and last main authors of each US study were entered into the FollowTheMoney.org website to extract any history of political party donation. Of the 68 US studies, 39 (57.4%) were unfavorable and only 7 (10.3%) yielded favorable results, compared with the 199 non-US studies, of which 66 (33.2%) were unfavorable, 69 (34.7%) favorable, and 64 (32.2%) neutral. Studies with at least one US main author were 20.4% (SE 0.053, P < 0.05) more likely to report unfavorable results than non-US studies. US studies with at least one main author donating to any political party were 25.6% (SE 0.085, P < 0.01) more likely to have unfavorable results. US studies with at least one author donating to the Democratic party were 20.4% (SE 0.045, P < 0.05) more likely to have unfavorable results. US authors were more likely to publish studies with medically harmful conclusions than non-US authors. Cardiology-specific HCQ/CQ studies were 44.2% more likely to yield harmful conclusions (P < 0.01). Inaccurate propagation of HCQ/CQ cardiac adverse effects with individual scientific author political bias has contributed to unfavorable US HCQ/CQ publication patterns and political polarization of the medications.
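The proportions quoted in this abstract can be recomputed from the stated counts. The Python sketch below is only an arithmetic check on the descriptive percentages. The dictionary layout and variable names are illustrative, and it does not attempt to reproduce the regression estimates (the "more likely" figures), whose model is not described here.

```python
# Study counts as stated in the abstract; the US neutral count is not
# reported there, so it is left out rather than inferred.
counts = {
    "US": {"total": 68, "unfavorable": 39, "favorable": 7},
    "non-US": {"total": 199, "unfavorable": 66, "favorable": 69, "neutral": 64},
}

# Share of each outcome within its group, rounded to one decimal place.
shares = {
    group: {
        outcome: round(n / data["total"] * 100, 1)
        for outcome, n in data.items()
        if outcome != "total"
    }
    for group, data in counts.items()
}

for group, by_outcome in shares.items():
    print(group, by_outcome)
```

The non-US outcome counts sum to 199, and 68 + 199 matches the 267 studies that met the criteria.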
Subject(s)
Antimalarials/therapeutic use , COVID-19 Drug Treatment , Gift Giving , Hydroxychloroquine/therapeutic use , Politics , Publication Bias , Humans , United States
ABSTRACT
The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org.
Subject(s)
Databases, Genetic , Genome, Human/genetics , Genomics , Pseudogenes/genetics , Animals , Computational Biology , Humans , Internet , Mice , Molecular Sequence Annotation , Software
ABSTRACT
New South Wales has recently added the capability of extracorporeal membrane oxygenation to the neonatal and paediatric retrieval process and this paper describes the early experiences and protocol development for the first eight cases transported.
Subject(s)
Extracorporeal Membrane Oxygenation , Australia , Child , Humans , Infant, Newborn , New South Wales , Retrospective Studies
ABSTRACT
The SARS-CoV-2 virus spreading across the world has led to surges of COVID-19 illness, hospitalizations, and death. The complex and multifaceted pathophysiology of life-threatening COVID-19 illness including viral mediated organ damage, cytokine storm, and thrombosis warrants early interventions to address all components of the devastating illness. In countries where therapeutic nihilism is prevalent, patients endure escalating symptoms and without early treatment can succumb to delayed in-hospital care and death. Prompt early initiation of sequenced multidrug therapy (SMDT) is a widely and currently available solution to stem the tide of hospitalizations and death. A multipronged therapeutic approach includes 1) adjuvant nutraceuticals, 2) combination intracellular anti-infective therapy, 3) inhaled/oral corticosteroids, 4) antiplatelet agents/anticoagulants, 5) supportive care including supplemental oxygen, monitoring, and telemedicine. Randomized trials of individual, novel oral therapies have not delivered tools for physicians to combat the pandemic in practice. No single therapeutic option thus far has been entirely effective and therefore a combination is required at this time. An urgent immediate pivot from single drug to SMDT regimens should be employed as a critical strategy to deal with the large numbers of acute COVID-19 patients with the aim of reducing the intensity and duration of symptoms and avoiding hospitalization and death.
Subject(s)
COVID-19 Drug Treatment , Leprostatic Agents/therapeutic use , Pandemics , SARS-CoV-2 , Telemedicine/methods , COVID-19/epidemiology , Drug Therapy, Combination , Humans
ABSTRACT
Reducing CO2 emissions is a key task of modern society to attenuate climate change and its environmental effects. Accelerated weathering of limestone (AWL) has been proposed as a tool to capture CO2 from effluent gas streams and store it primarily as bicarbonate in the marine environment. We evaluated the performance of the biggest AWL-reactor to date that was installed at a coal-fired power plant in Germany. Depending on the gas flow rate, approximately 55% of the CO2 could be removed from the flue gas. The generated product water was characterized by an up to 5-fold increase in alkalinity, which indicates the successful weathering of limestone and the long-term storage of the captured CO2. A rise of potentially harmful substances in the product water (NO2-, NOx-, NH4+, SO42-, and heavy metals) or in unreacted limestone particles (heavy metals) to levels of environmental concern could not be observed, most likely as a result of a desulfurization of the flue gas before it entered the AWL reactor. At locations where limestone and water availability is high, AWL could be used for a safe and long-term storage of CO2.
Subject(s)
Air Pollutants , Carbon Dioxide , Calcium Carbonate , Carbon , Coal , Germany , Power Plants
ABSTRACT
The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community.
Subject(s)
Consensus Sequence , Databases, Genetic , Open Reading Frames , Animals , Data Curation/methods , Data Curation/standards , Databases, Genetic/standards , Guidelines as Topic , Humans , Mice , Molecular Sequence Annotation , National Library of Medicine (U.S.) , United States , User-Computer Interface
ABSTRACT
The Earth's transition zone, at depths of 410-660 km, while being composed of nominally anhydrous magnesium silicate minerals, may be subject to significant hydration. Little is known about the mechanism of hydration, despite the vital role this plays in the physical and chemical properties of the mantle, leading to a need for improved structural characterization. Here we present an ab initio random structure searching (AIRSS) investigation of semihydrous (1.65 wt % H2O) and fully hydrous (3.3 wt % H2O) wadsleyite. Following the AIRSS process, k-means clustering was used to select sets of structures with duplicates removed, which were then subjected to further geometry optimization with tighter constraints prior to NMR calculations. Semihydrous models identify a ground-state structure (Mg3 vacancies, O1-H hydroxyls) that aligns with a number of previous experimental observations. However, predicted NMR parameters fail to reproduce low-intensity signals observed in solid-state NMR spectra. In contrast, the fully hydrous models produced by AIRSS, which enable both isolated and clustered defects, are able to explain observed NMR signals via just four low-enthalpy structures: (i) a ground state, with isolated Mg3 vacancies and O1-H hydroxyls; (ii/iii) edge-sharing Mg3 vacancies with O1-H and O3-H species; and (iv) edge-sharing Mg1 and Mg3 vacancies with O1-H, O3-H, and O4-H hydroxyls. Thus, the combination of advanced structure searching approaches and solid-state NMR spectroscopy is able to provide new and detailed insight into the structure of this important mantle mineral.