ABSTRACT
Gene duplication is a major evolutionary force driving adaptation and speciation, as it allows for the acquisition of new functions and can augment or diversify existing functions. Here, we report a gene duplication event that yielded another outcome--the generation of antagonistic functions. One product of this duplication event--UPF3B--is critical for the nonsense-mediated RNA decay (NMD) pathway, while its autosomal counterpart--UPF3A--encodes an enigmatic protein previously shown to have trace NMD activity. Using loss-of-function approaches in vitro and in vivo, we discovered that UPF3A acts primarily as a potent NMD inhibitor that stabilizes hundreds of transcripts. Evidence suggests that UPF3A acquired repressor activity through simple impairment of a critical domain, a rapid mechanism that may have been widely used in evolution. Mice conditionally lacking UPF3A exhibit "hyper" NMD and display defects in embryogenesis and gametogenesis. Our results support a model in which UPF3A serves as a molecular rheostat that directs developmental events.
Subjects
Embryonic Development , Genes, Duplicate , Nonsense Mediated mRNA Decay , RNA-Binding Proteins/metabolism , Animals , Cell Line, Tumor , Evolution, Molecular , Gametogenesis , HeLa Cells , Humans , Mice
ABSTRACT
We describe an update of MirGeneDB, the manually curated microRNA gene database. Adhering to uniform and consistent criteria for microRNA annotation and nomenclature, we substantially expanded MirGeneDB with 30 additional species representing previously missing metazoan phyla such as sponges, jellyfish, rotifers and flatworms. MirGeneDB 2.1 now consists of 75 species spanning over ~800 million years of animal evolution, and contains a total number of 16 670 microRNAs from 1549 families. Over 6000 microRNAs were added in this update using ~550 datasets with ~7.5 billion sequencing reads. By adding new phylogenetically important species, especially those relevant for the study of whole genome duplication events, and through updating evolutionary nodes of origin for many families and genes, we were able to substantially refine our nomenclature system. All changes are traceable in the specifically developed MirGeneDB version tracker. The performance of read-pages is improved and microRNA expression matrices for all tissues and species are now also downloadable. Altogether, this update represents a significant step toward a complete sampling of all major metazoan phyla, and a widely needed foundation for comparative microRNA genomics and transcriptomics studies. MirGeneDB 2.1 is part of RNAcentral and Elixir Norway, publicly and freely available at http://www.mirgenedb.org/.
Subjects
Computational Biology , Databases, Genetic , Evolution, Molecular , Genomics , Animals , Humans , MicroRNAs/classification , MicroRNAs/genetics , Phylogeny
ABSTRACT
Whole-genome duplications (WGDs) have long been considered the causal mechanism underlying dramatic increases in morphological complexity due to the neo-functionalization of paralogs generated during these events. Nonetheless, an alternative hypothesis suggests that behind the retention of most paralogs is not neo-functionalization, but instead the degree of the inter-connectivity of the intended gene product, as well as the mode of the WGD itself. Here, we explore both the causes and consequences of WGD by examining the distribution, expression, and molecular evolution of microRNAs (miRNAs) in both gnathostome vertebrates and chelicerate arthropods. We find that although the number of miRNA paralogs tracks the number of WGDs experienced within the lineage, few of these paralogs experienced changes to the seed sequence, and thus are functionally equivalent relative to their mRNA targets. Nonetheless, in gnathostomes, although the retention of paralogs following the 1R autotetraploidization event is similar across the two subgenomes, the paralogs generated by the gnathostome 2R allotetraploidization event are retained in higher numbers on one subgenome relative to the second, with the miRNAs found on the preferred subgenome showing both higher expression of mature miRNA transcripts and slower molecular evolution of the precursor miRNA sequences. Importantly, WGDs do not result in the creation of miRNA novelty, nor do WGDs correlate with increases in complexity. Instead, it is the number of miRNA seed sequences in the genome itself that not only better correlates with instances of complexification, but also mechanistically explains why complexity increases when new miRNA families are established.
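The seed-based notion of functional equivalence in the abstract above can be illustrated with a minimal sketch. It assumes the common convention that the seed is nucleotides 2-8 of the mature miRNA; the abstract does not state which convention the authors used, and the function names and example sequences below are hypothetical.

```python
from collections import defaultdict


def seed(mature: str) -> str:
    """Return the seed of a mature miRNA, taken here as nucleotides 2-8 (1-based).

    Assumption: the 2-8 convention; some studies use 2-7 instead.
    """
    rna = mature.upper().replace("T", "U")  # accept DNA-style input as well
    if len(rna) < 8:
        raise ValueError("sequence too short to contain a seed")
    return rna[1:8]


def group_by_seed(paralogs: dict[str, str]) -> dict[str, list[str]]:
    """Group paralog names by shared seed; one group means equivalent targeting."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, sequence in paralogs.items():
        groups[seed(sequence)].append(name)
    return dict(groups)


# Hypothetical paralogs: P1 and P2 differ only outside the seed, P3 differs inside it.
toy_paralogs = {
    "Toy-P1": "UGAGGUAGUAGGUUGUAUAGUU",
    "Toy-P2": "UGAGGUAGUAGGUUGUGUGGUU",
    "Toy-P3": "UGAGCUAGUAGGUUGUAUAGUU",
}
groups = group_by_seed(toy_paralogs)
```

Under this sketch, paralogs that land in the same group would be expected to recognize the same mRNA targets, which is the sense in which retained paralogs with unchanged seeds add copies without adding regulatory novelty.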
Subjects
Gene Duplication , Genome , MicroRNAs , Animals , Evolution, Molecular , MicroRNAs/genetics , Phylogeny
ABSTRACT
The evolution of specialized cell-types is a long-standing interest of biologists but, given the deep time-scales involved, is very difficult to reconstruct or observe. microRNAs have been linked to the evolution of cellular complexity and may inform on specialization. The endothelium is a vertebrate-specific specialization of the circulatory system that enabled a critical new level of vasoregulation. The evolutionary origin of these endothelial cells is unclear. We hypothesized that Mir-126, an endothelial cell-specific microRNA, may be informative. We here reconstruct the evolutionary history of Mir-126. Mir-126 likely appeared in the last common ancestor of vertebrates and tunicates, which was a species without an endothelium, within an intron of the evolutionarily much older EGF Like Domain Multiple (Egfl) locus. Mir-126 has a complex evolutionary history due to duplications and losses of both the host gene and the microRNA. Taking advantage of the strong evolutionary conservation of the microRNA among Olfactores, and using RNA in situ hybridization, we localized Mir-126 in the tunicate Ciona robusta. We found exclusive expression of the mature Mir-126 in granular amoebocytes, supporting a long-proposed scenario that endothelial cells arose from hemoblasts, a type of proto-endothelial amoebocyte found throughout invertebrates. This observed change of expression of Mir-126 from proto-endothelial amoebocytes in the tunicate to endothelial cells in vertebrates is the first direct observation of the evolution of a cell-type in relation to microRNA expression, indicating that microRNAs can be a prerequisite of cell-type evolution.
Subjects
Endothelial Cells , MicroRNAs , Animals , Endothelial Cells/metabolism , Vertebrates/genetics , Invertebrates/genetics , MicroRNAs/genetics , MicroRNAs/metabolism
ABSTRACT
Since 2002, published miRNAs have been collected and named by the online repository miRBase. However, with 11 000 annual publications this has become challenging. Recently, four specialized miRNA databases were published, addressing particular needs for diverse scientific communities. This development provides major opportunities for the future of miRNA annotation and nomenclature.
Subjects
Databases, Nucleic Acid , Gene Expression Regulation , MicroRNAs/genetics , Molecular Sequence Annotation/standards , Sequence Analysis, RNA/standards , Software , Genomics , Humans
ABSTRACT
Although microRNAs (miRNAs) are among the most intensively studied molecules of the past 20 years, determining what is and what is not a miRNA has not been straightforward. Here, we present a uniform system for the annotation and nomenclature of miRNA genes. We show that less than a third of the 1,881 human miRBase entries, and only approximately 16% of the 7,095 metazoan miRBase entries, are robustly supported as miRNA genes. Furthermore, we show that the human repertoire of miRNAs has been shaped by periods of intense miRNA innovation and that mature gene products show a very different tempo and mode of sequence evolution than star products. We establish a new open access database--MirGeneDB ( http://mirgenedb.org )--to catalog this set of miRNAs, which complements the efforts of miRBase but differs from it by annotating the mature versus star products and by imposing an evolutionary hierarchy upon this curated and consistently named repertoire.
Subjects
Biological Evolution , MicroRNAs/genetics , Molecular Sequence Annotation/methods , Vertebrates/genetics , Animals , Databases, Genetic , Evolution, Molecular , Humans , Terminology as Topic
ABSTRACT
Small non-coding RNAs have gained substantial attention due to their roles in animal development and human disorders. Among them, microRNAs are special because individual gene sequences are conserved across the animal kingdom. In addition, unique and mechanistically well understood features can clearly distinguish bona fide miRNAs from the myriad other small RNAs generated by cells. However, making this distinction is not a common practice and, thus, not surprisingly, the heterogeneous quality of available miRNA complements has become a major concern in microRNA research. We addressed this by extensively expanding our curated microRNA gene database - MirGeneDB - to 45 organisms, encompassing a wide phylogenetic swath of animal evolution. By consistently annotating and naming 10,899 microRNA genes in these organisms, we show that previous microRNA annotations contained not only many false positives, but surprisingly lacked >2000 bona fide microRNAs. Indeed, curated microRNA complements of closely related organisms are very similar and can be used to reconstruct ancestral miRNA repertoires. MirGeneDB represents a robust platform for microRNA-based research, providing deeper and more significant insights into the biology and evolution of miRNAs as well as biomedical and biomarker research. MirGeneDB is publicly and freely available at http://mirgenedb.org/.
Subjects
Computational Biology/methods , Databases, Nucleic Acid , MicroRNAs/genetics , Software , Web Browser , Animals , Conserved Sequence , Evolution, Molecular , MicroRNAs/classification , Molecular Sequence Annotation , Phylogeny , User-Computer Interface
ABSTRACT
A lack of knowledge of the cellular origin of miRNAs has greatly confounded functional and biomarker studies. Recently, three studies characterized miRNA expression patterns across >78 human cell types. These combined data expand our knowledge of miRNA expression localization and confirm that many miRNAs show cell type-specific expression patterns.
Subjects
Gene Expression Profiling/methods , Gene Expression Regulation , High-Throughput Nucleotide Sequencing/methods , MicroRNAs/genetics , Animals , Eukaryotic Cells/cytology , Eukaryotic Cells/metabolism , Humans , Organ Specificity/genetics , RNA, Messenger/genetics
ABSTRACT
Testis-expressed X-linked genes typically evolve rapidly. Here, we report on a testis-expressed X-linked microRNA (miRNA) cluster that, despite rapid alterations in sequence, has retained its position in the Fragile-X region of the X chromosome in placental mammals. Surprisingly, the miRNAs encoded by this cluster (Fx-mir) have a predilection for targeting the immediately adjacent gene, Fmr1, an unexpected finding given that miRNAs usually act in trans, not in cis. Robust repression of Fmr1 is conferred by combinations of Fx-mir miRNAs induced in Sertoli cells (SCs) during postnatal development when they terminate proliferation. Physiological significance is suggested by the finding that FMRP, the protein product of Fmr1, is downregulated when Fx-mir miRNAs are induced, and that FMRP loss causes SC hyperproliferation and spermatogenic defects. Fx-mir miRNAs not only regulate the expression of FMRP, but also regulate the expression of eIF4E and CYFIP1, which together with FMRP form a translational regulatory complex. Our results support a model in which Fx-mir family members act cooperatively to regulate the translation of batteries of mRNAs in a developmentally regulated manner in SCs.
Subjects
Fragile X Mental Retardation Protein/genetics , MicroRNAs/genetics , Multigene Family , RNA Interference , RNA, Messenger/genetics , Spermatogenesis/genetics , 3' Untranslated Regions , Animals , Gene Expression Regulation , Humans , Male , Mice , Testis/metabolism
ABSTRACT
Resource Description Framework (RDF) is one of the three standardized data formats in the HL7 Fast Healthcare Interoperability Resources (FHIR) specification and is being used by healthcare and research organizations to join FHIR and non-FHIR data. However, RDF had not previously been integrated into popular FHIR tooling packages, hindering the adoption of FHIR RDF in the semantic web and other communities. The objective of the study is to develop and evaluate a Java-based FHIR RDF data transformation toolkit to facilitate the use and validation of FHIR RDF data. We extended the popular HAPI FHIR tooling to add RDF support, thus enabling FHIR data in XML or JSON to be transformed to or from RDF. We also developed an RDF Shape Expression (ShEx)-based validation framework to verify conformance of FHIR RDF data to the ShEx schemas provided in the FHIR specification for FHIR versions R4 and R5. The effectiveness of ShEx validation was demonstrated by testing it against 2693 FHIR R4 examples and 2197 FHIR R5 examples that are included in the FHIR specification. Five types of errors (missing properties, unknown element, missing resource Type, invalid attribute value, and unknown resource name) were revealed in the R5 examples, demonstrating the value of ShEx in the quality assurance of the evolving R5 development. This FHIR RDF data transformation and validation framework, based on HAPI and ShEx, is robust and ready for community use in adopting FHIR RDF, improving FHIR data quality, and evolving the FHIR specification.
Subjects
Delivery of Health Care , Electronic Health Records
ABSTRACT
The animal kingdom exhibits a great diversity of organismal form (i.e., disparity). Whether the extremes of disparity were achieved early in animal evolutionary history or clades continually explore the limits of possible morphospace is subject to continuing debate. Here we show, through analysis of the disparity of the animal kingdom, that, even though many clades exhibit maximal initial disparity, arthropods, chordates, annelids, echinoderms, and mollusks have continued to explore and expand the limits of morphospace throughout the Phanerozoic, expanding dramatically the envelope of disparity occupied in the Cambrian. The "clumpiness" of morphospace occupation by living clades is a consequence of the extinction of phylogenetic intermediates, indicating that the original distribution of morphologies was more homogeneous. The morphological distances between phyla mirror differences in complexity, body size, and species-level diversity across the animal kingdom. Causal hypotheses of morphologic expansion include time since origination, increases in genome size, protein repertoire, gene family expansion, and gene regulation. We find a strong correlation between increasing morphological disparity, genome size, and microRNA repertoire, but no correlation to protein domain diversity. Our results are compatible with the view that the evolution of gene regulation has been influential in shaping metazoan disparity whereas the invasion of terrestrial ecospace appears to represent an additional gestalt, underpinning the post-Cambrian expansion of metazoan disparity.
Subjects
Biodiversity , Biological Evolution , Gene Expression Regulation/physiology , Genome Size/physiology , MicroRNAs/physiology , Animals , Fossils , Proteins/genetics
ABSTRACT
Free-text problem descriptions are brief explanations of patient diagnoses and issues, commonly found in problem lists and other prominent areas of the medical record. These compact representations often express complex and nuanced medical conditions, making their semantics challenging to fully capture and standardize. In this study, we describe a framework for transforming free-text problem descriptions into standardized Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) models. This approach leverages a combination of domain-specific dependency parsers, Bidirectional Encoder Representations from Transformers (BERT) natural language models, and cui2vec Unified Medical Language System (UMLS) concept vectors to align extracted concepts from free-text problem descriptions into structured FHIR models. A neural network classification model is used to classify thirteen relationship types between concepts, facilitating mapping to the FHIR Condition resource. We use data programming, a weak supervision approach, to eliminate the need for a manually annotated training corpus. Shapley values, a mechanism to quantify contribution, are used to interpret the impact of model features. We found that our methods identified the focus concept, or primary clinical concern of the problem description, with an F1 score of 0.95. Relationships from the focus to other modifying concepts were extracted with an F1 score of 0.90. When classifying relationships, our model achieved a 0.89 weighted average F1 score, enabling accurate mapping of attributes into HL7 FHIR models. We also found that the BERT input representation predominantly contributed to the classifier decision as shown by the Shapley values analysis.
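For readers unfamiliar with the "weighted average F1 score" reported in the abstract above, the following is a minimal sketch of how such a score is usually computed: per-class F1 values averaged with weights proportional to each class's number of true instances. The function name and the toy labels are hypothetical, and the abstract does not describe the authors' own evaluation code.

```python
from collections import Counter


def weighted_f1(y_true: list[str], y_pred: list[str]) -> float:
    """Support-weighted mean of per-class F1 scores.

    Classes are taken from y_true; a class with no true positives gets F1 = 0.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    support = Counter(y_true)  # number of true instances per class
    total = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        total += count * f1
    return total / len(y_true)


# Hypothetical relationship labels, not taken from the study.
truth = ["severity", "severity", "severity", "location"]
predicted = ["severity", "severity", "location", "location"]
score = weighted_f1(truth, predicted)
```

Weighting by support means frequent relationship types dominate the average, so a high weighted score can coexist with weaker performance on rare types.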
Subjects
Electronic Health Records , Health Level Seven , Humans , Reference Standards , Software , Unified Medical Language System
ABSTRACT
BACKGROUND: Concept extraction, a subdomain of natural language processing (NLP) with a focus on extracting concepts of interest, has been adopted to computationally extract clinical information from text for a wide range of applications ranging from clinical decision support to care quality improvement. OBJECTIVES: In this literature review, we provide a methodology review of clinical concept extraction, aiming to catalog development processes, available methods and tools, and specific considerations when developing clinical concept extraction applications. METHODS: Based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, a literature search was conducted for retrieving EHR-based information extraction articles written in English and published from January 2009 through June 2019 from Ovid MEDLINE In-Process & Other Non-Indexed Citations, Ovid MEDLINE, Ovid EMBASE, Scopus, Web of Science, and the ACM Digital Library. RESULTS: A total of 6,686 publications were retrieved. After title and abstract screening, 228 publications were selected. The methods used for developing clinical concept extraction applications were discussed in this review.
Subjects
Information Storage and Retrieval , Natural Language Processing , Bibliometrics , Research Design
ABSTRACT
Xenoturbellida and Acoelomorpha are marine worms with contentious ancestry. Both were originally associated with the flatworms (Platyhelminthes), but molecular data have revised their phylogenetic positions, generally linking Xenoturbellida to the deuterostomes and positioning the Acoelomorpha as the most basally branching bilaterian group(s). Recent phylogenomic data suggested that Xenoturbellida and Acoelomorpha are sister taxa and together constitute an early branch of Bilateria. Here we assemble three independent data sets--mitochondrial genes, a phylogenomic data set of 38,330 amino-acid positions and new microRNA (miRNA) complements--and show that the position of Acoelomorpha is strongly affected by a long-branch attraction (LBA) artefact. When we minimize LBA we find consistent support for a position of both acoelomorphs and Xenoturbella within the deuterostomes. The most likely phylogeny links Xenoturbella and Acoelomorpha in a clade we call Xenacoelomorpha. The Xenacoelomorpha is the sister group of the Ambulacraria (hemichordates and echinoderms). We show that analyses of miRNA complements have been affected by character loss in the acoels and that both groups possess one miRNA and the gene Rsb66 otherwise specific to deuterostomes. In addition, Xenoturbella shares one miRNA with the ambulacrarians, and two with the acoels. This phylogeny makes sense of the shared characteristics of Xenoturbellida and Acoelomorpha, such as ciliary ultrastructure and diffuse nervous system, and implies the loss of various deuterostome characters in the Xenacoelomorpha including coelomic cavities, through gut and gill slits.
Subjects
Aquatic Organisms/classification , Phylogeny , Anal Canal , Animals , Aquatic Organisms/genetics , Aquatic Organisms/physiology , Bayes Theorem , Expressed Sequence Tags , Gills , MicroRNAs/genetics , Mitochondrial Proteins/genetics
ABSTRACT
microRNAs (miRNAs) are a key component of gene regulatory networks and have been implicated in the regulation of virtually every biological process found in multicellular eukaryotes. What makes them interesting from a phylogenetic perspective is the high conservation of primary sequence between taxa, their accrual in metazoan genomes through evolutionary time, and the rarity of secondary loss in most metazoan taxa. Despite these properties, the use of miRNAs as phylogenetic markers has not yet been discussed within a clear conceptual framework. Here we highlight five properties of miRNAs that underlie their utility in phylogenetics: 1) The processes of miRNA biogenesis enable the identification of novel miRNAs without prior knowledge of sequence; 2) The continuous addition of miRNA families to metazoan genomes through evolutionary time; 3) The low level of secondary gene loss in most metazoan taxa; 4) The low substitution rate in the mature miRNA sequence; and 5) The small probability of convergent evolution of two miRNAs. Phylogenetic analyses using both Bayesian and parsimony methods on a eumetazoan miRNA data set highlight the potential of miRNAs to become an invaluable new tool, especially when used as an additional line of evidence, to resolve previously intractable nodes within the tree of life.
Subjects
Evolution, Molecular , MicroRNAs/genetics , MicroRNAs/metabolism , Phylogeny , Animals , Base Sequence , Bayes Theorem , Conserved Sequence , Gene Regulatory Networks , Genome , Humans , Secondary Metabolism/genetics , Species Specificity
ABSTRACT
Understanding the phylogenetic position of crown turtles (Testudines) among amniotes has been a source of particular contention. Recent morphological analyses suggest that turtles are sister to all other reptiles, whereas the vast majority of gene sequence analyses support turtles as being inside Diapsida, and usually as sister to crown Archosauria (birds and crocodilians). Previously, a study using microRNAs (miRNAs) placed turtles inside diapsids, but as sister to lepidosaurs (lizards and Sphenodon) rather than archosaurs. Here, we test this hypothesis with an expanded miRNA presence/absence dataset, and employ more rigorous criteria for miRNA annotation. Significantly, we find no support for a turtle + lepidosaur sister-relationship; instead, we recover strong support for turtles sharing a more recent common ancestor with archosaurs. We further test this result by analyzing a super-alignment of precursor miRNA sequences for every miRNA inferred to have been present in the most recent common ancestor of tetrapods. This analysis yields a topology that is fully congruent with our presence/absence analysis; our results are therefore in accordance with most gene sequence studies, providing strong, consilient molecular evidence from diverse independent datasets regarding the phylogenetic position of turtles.
Subjects
MicroRNAs/genetics , Reptiles/classification , Reptiles/genetics , Animals , Birds/classification , Birds/genetics , Phylogeny
ABSTRACT
The recent discovery of microRNAs (miRNAs) in unicellular eukaryotes, including miRNAs known previously only from animals or plants, implies that miRNAs have a deep evolutionary history among eukaryotes. This contrasts with the prevailing view that miRNAs evolved convergently in animals and plants. We re-evaluate the evidence and find that none of the 73 plant and animal miRNAs described from protists meet the required criteria for miRNA annotation and, by implication, animals and plants did not acquire any of their respective miRNA genes from the crown ancestor of eukaryotes. Furthermore, of the 159 novel miRNAs previously identified among the seven species of unicellular protists examined, only 28, from the algae Ectocarpus and Chlamydomonas, meet the criteria for miRNA annotation. Therefore, at present only five groups of eukaryotes are known to possess miRNAs, indicating that miRNAs have evolved independently within eukaryotes through exaptation of their shared inherited RNAi machinery.
Subjects
Evolution, Molecular , MicroRNAs/genetics , Animals , Base Sequence , Humans , MicroRNAs/metabolism , MicroRNAs/physiology , Models, Genetic , Molecular Sequence Annotation , Molecular Sequence Data , Nucleic Acid Conformation , Phylogeny , RNA Interference , RNA, Plant/genetics , RNA, Protozoan/genetics
ABSTRACT
Morphological data traditionally group Tardigrada (water bears), Onychophora (velvet worms), and Arthropoda (e.g., spiders, insects, and their allies) into a monophyletic group of invertebrates with walking appendages known as the Panarthropoda. However, molecular data generally do not support the inclusion of tardigrades within the Panarthropoda, but instead place them closer to Nematoda (roundworms). Here we present results from the analyses of two independent genomic datasets, expressed sequence tags (ESTs) and microRNAs (miRNAs), which congruently resolve the phylogenetic relationships of Tardigrada. Our EST analyses, based on 49,023 amino acid sites from 255 proteins, significantly support a monophyletic Panarthropoda including Tardigrada and suggest a sister group relationship between Arthropoda and Onychophora. Using careful experimental manipulations--comparisons of model fit, signal dissection, and taxonomic pruning--we show that support for a Tardigrada + Nematoda group derives from the phylogenetic artifact of long-branch attraction. Our small RNA libraries fully support our EST results; no miRNAs were found to link Tardigrada and Nematoda, whereas all panarthropods were found to share one unique miRNA (miR-276). In addition, Onychophora and Arthropoda were found to share a second miRNA (miR-305). Our study confirms the monophyly of the legged ecdysozoans, shows that past support for a Tardigrada + Nematoda group was due to long-branch attraction, and suggests that the velvet worms are the sister group to the arthropods.