ABSTRACT
Cells contain hundreds of organelles and macromolecular assemblies. Obtaining a complete understanding of their intricate organization requires the nanometre-level, three-dimensional reconstruction of whole cells, which is only feasible with robust and scalable automatic methods. Here, to support the development of such methods, we annotated up to 35 different cellular organelle classes, ranging from endoplasmic reticulum to microtubules to ribosomes, in diverse sample volumes from multiple cell types imaged at a near-isotropic resolution of 4 nm per voxel with focused ion beam scanning electron microscopy (FIB-SEM). We trained deep learning architectures to segment these structures in 4 nm and 8 nm per voxel FIB-SEM volumes, validated their performance and showed that automatic reconstructions can be used to directly quantify previously inaccessible metrics including spatial interactions between cellular components. We also show that such reconstructions can be used to automatically register light and electron microscopy images for correlative studies. We have created an open data and open-source web repository, 'OpenOrganelle', to share the data, computer code and trained models, which will enable scientists everywhere to query and further improve automatic reconstruction of these datasets.
Subject(s)
Microscopy, Electron, Scanning/methods , Microscopy, Electron, Scanning/standards , Organelles/ultrastructure , Animals , Biomarkers/analysis , COS Cells , Cell Size , Chlorocebus aethiops , Datasets as Topic , Deep Learning , Endoplasmic Reticulum , HeLa Cells , Humans , Information Dissemination , Microscopy, Fluorescence , Microtubules , Reproducibility of Results , Ribosomes
ABSTRACT
BACKGROUND: Neuroscience research in Drosophila is benefiting from large-scale connectomics efforts using electron microscopy (EM) to reveal all the neurons in a brain and their connections. To exploit this knowledge base, researchers relate a connectome's structure to neuronal function, often by studying individual neuron cell types. Vast libraries of fly driver lines expressing fluorescent reporter genes in sets of neurons have been created and imaged using confocal light microscopy (LM), enabling the targeting of neurons for experimentation. However, creating a fly line for driving gene expression within a single neuron found in an EM connectome remains a challenge, as it typically requires identifying a pair of driver lines where only the neuron of interest is expressed in both. This task and other emerging scientific workflows require finding similar neurons across large data sets imaged using different modalities. RESULTS: Here, we present NeuronBridge, a web application for easily and rapidly finding putative morphological matches between large data sets of neurons imaged using different modalities. We describe the functionality and construction of the NeuronBridge service, including its user-friendly graphical user interface (GUI), extensible data model, serverless cloud architecture, and massively parallel image search engine. CONCLUSIONS: NeuronBridge fills a critical gap in the Drosophila research workflow and is used by hundreds of neuroscience researchers around the world. We offer our software code, open APIs, and processed data sets for integration and reuse, and provide the application as a service at http://neuronbridge.janelia.org.
Subject(s)
Connectome , Software , Animals , Neurons , Microscopy, Electron , Drosophila
ABSTRACT
The multidisciplinary treatment of a 41-year-old man with cleidocranial dysplasia is described. A rapid external distraction device was used to reposition the maxilla before the prosthodontic rehabilitation.
Subject(s)
Cleidocranial Dysplasia , Dental Prosthesis, Implant-Supported , Maxilla , Adult , Humans , Male , Cleidocranial Dysplasia/diagnostic imaging , Cleidocranial Dysplasia/surgery , Maxilla/surgery , Treatment Outcome
ABSTRACT
Pairwise sequence covariations are a signal of conserved RNA secondary structure. We describe a method for distinguishing when lack of covariation signal can be taken as evidence against a conserved RNA structure, as opposed to when a sequence alignment merely has insufficient variation to detect covariations. We find that alignments for several long non-coding RNAs previously shown to lack covariation support do have adequate covariation detection power, providing additional evidence against their proposed conserved structures. AVAILABILITY AND IMPLEMENTATION: The R-scape web server is at eddylab.org/R-scape, with a link to download the source code. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
RNA, Long Noncoding , RNA , Algorithms , Conserved Sequence , Nucleic Acid Conformation , RNA/genetics , Sequence Alignment , Sequence Analysis, RNA , Software
ABSTRACT
Many functional RNAs have an evolutionarily conserved secondary structure. Conservation of RNA base pairing induces pairwise covariations in sequence alignments. We developed a computational method, R-scape (RNA Structural Covariation Above Phylogenetic Expectation), that quantitatively tests whether covariation analysis supports the presence of a conserved RNA secondary structure. R-scape analysis finds no statistically significant support for proposed secondary structures of the long noncoding RNAs HOTAIR, SRA, and Xist.
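As a rough illustration of what a pairwise covariation signal looks like, the sketch below computes plain mutual information between two alignment columns. This is an assumed stand-in for illustration only: R-scape's actual covariation statistic and its phylogeny-aware null model are not reproduced here, and the function name and toy columns are hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(col_i, col_j):
    """Mutual information (bits) between two alignment columns.

    Illustrative only: this is raw mutual information, not the statistic
    or significance test used by R-scape. Rows with a gap ('-') in either
    column are skipped.
    """
    pairs = [(a, b) for a, b in zip(col_i, col_j) if a != "-" and b != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    pair_counts = Counter(pairs)
    left_counts = Counter(a for a, _ in pairs)
    right_counts = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = left_counts[a] / n
        p_b = right_counts[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy columns: every row keeps a Watson-Crick pair while
# the residues themselves change, so the columns covary.
covarying = column_mutual_information("GGCCAAUU", "CCGGUUAA")

# A perfectly conserved G-C pair: a base pair may well exist, but with
# no sequence variation there is nothing to covary.
conserved = column_mutual_information("GGGG", "CCCC")
```

The conserved case scores zero even though it is compatible with a base pair, which is why a lack of covariation is only informative when the alignment contains enough variation to detect it.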
Subject(s)
Evolution, Molecular , Phylogeny , RNA, Long Noncoding/chemistry , RNA, Long Noncoding/genetics , Base Pairing , Base Sequence , Humans , Nucleic Acid Conformation
ABSTRACT
Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Dfam is an open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). The initial release of Dfam, featured in the 2013 NAR Database Issue, contained 1143 families of repetitive elements found in humans, and was used to produce more than 100 Mb of additional annotation of TE-derived regions in the human genome, with improved speed. Here, we describe recent advances, most notably expansion to 4150 total families including a comprehensive set of known repeat families from four new organisms (mouse, zebrafish, fly and nematode). We describe improvements to coverage, and to our methods for identifying and reducing false annotation. We also describe updates to the website interface. The Dfam website has moved to http://dfam.org. Seed alignments, profile HMMs, hit lists and other underlying data are available for download.
Subject(s)
DNA Transposable Elements , DNA/chemistry , Databases, Nucleic Acid , Repetitive Sequences, Nucleic Acid , Animals , DNA/classification , Genome , Humans , Internet , Markov Chains , Mice , Molecular Sequence Annotation , Sequence Alignment
ABSTRACT
STATEMENT OF PROBLEM: Denture tooth fracture may limit the longevity of dental prostheses. Whether the strength of the denture tooth material is affected by the denture processing technique is unclear. PURPOSE: The purpose of this in vitro study was to investigate whether the denture processing technique affects the mechanical properties of denture tooth materials. MATERIAL AND METHODS: Two denture processing techniques, injection and compression molding, were tested for 3 types of denture teeth: nanohybrid composite (NHC), interpenetrating network (IPN), and microfiller-reinforced polyacrylic (MRP). Denture teeth were processed by using an injection-molded resin or a compression-molded resin. Unprocessed denture teeth served as the control. After teeth were processed, they were sectioned into rectangular beams for 3-point bend testing (n=20 to 24). Elastic moduli were determined from load deflection and maximum stress from maximum bending load. The results were statistically analyzed by using 2-way ANOVA and multiple comparisons (α=.05). RESULTS: The processing technique and the type of denture tooth affected both the elastic modulus and the maximum stress. The injection-molded technique resulted in significantly higher (24% to 26%) elastic modulus for NHC and IPN (12% higher in MRP, but not statistically significant) and higher (12% to 17%) maximum stresses for IPN and MRP (3% lower in NHC, but not statistically significant). Compression-molded technique increased the elastic modulus of IPN and NHC by 10% to 17% (3% lower in MRP but not statistically significant), but maximum stresses were not statistically significantly different in any of the tested teeth. Regardless of processing, MRP teeth had the highest elastic modulus (8.0 to 9.2 GPa) but the lowest maximum stresses (97 to 124 MPa), whereas IPN teeth had the lowest elastic modulus (5.5 GPa) but high or highest maximum stress (171 to 192 MPa). 
CONCLUSIONS: The injection-molded technique significantly increased the elastic modulus of NHC and IPN teeth and significantly increased the maximum stress of IPN teeth. The compression-molded technique did not significantly affect mechanical properties of denture teeth.
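The abstract does not give the formulas or specimen dimensions used, so the following is only a sketch of the standard rectangular-beam relations for 3-point bending: maximum stress from the peak load, and elastic modulus from the slope of the load-deflection curve. The function names and all numeric inputs are hypothetical and chosen for round results.

```python
def flexural_stress_mpa(max_load_n, span_mm, width_mm, height_mm):
    """Maximum bending stress for a rectangular beam in 3-point bending.

    sigma = 3 * F * L / (2 * b * h^2); with N and mm the result is in MPa.
    """
    return 3 * max_load_n * span_mm / (2 * width_mm * height_mm ** 2)


def flexural_modulus_gpa(slope_n_per_mm, span_mm, width_mm, height_mm):
    """Elastic modulus from the initial load-deflection slope.

    E = m * L^3 / (4 * b * h^3); with N/mm and mm the result is in MPa,
    converted here to GPa.
    """
    modulus_mpa = slope_n_per_mm * span_mm ** 3 / (4 * width_mm * height_mm ** 3)
    return modulus_mpa / 1000


# Hypothetical beam: 10 mm support span, 2 mm wide, 1 mm high.
stress = flexural_stress_mpa(max_load_n=20, span_mm=10, width_mm=2, height_mm=1)
modulus = flexural_modulus_gpa(slope_n_per_mm=48, span_mm=10, width_mm=2, height_mm=1)
print(f"maximum stress: {stress:.0f} MPa, elastic modulus: {modulus:.1f} GPa")
```

With these made-up inputs the sketch yields 150 MPa and 6.0 GPa, which fall within the ranges reported above only because the inputs were picked that way.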
Subject(s)
Dental Materials/chemistry , Dental Stress Analysis , Denture Bases , Denture Design/methods , Dentures , Elastic Modulus , Acrylic Resins , Analysis of Variance , Chemical Phenomena , Chemistry, Physical , Compressive Strength , Humans , Injections , Materials Testing , Pressure , Stress, Mechanical
ABSTRACT
The HMMER website, available at http://www.ebi.ac.uk/Tools/hmmer/, provides access to the protein homology search algorithms found in the HMMER software suite. Since the first release of the website in 2011, the search repertoire has been expanded to include the iterative search algorithm, jackhmmer. The continued growth of the target sequence databases means that traditional tabular representations of significant sequence hits can be overwhelming to the user. Consequently, additional ways of presenting homology search results have been developed, allowing them to be summarised according to taxonomic distribution or domain architecture. The taxonomy and domain architecture representations can be used in combination to filter the results according to the needs of a user. Searches can also be restricted prior to submission using a new taxonomic filter, which not only ensures that the results are specific to the requested taxonomic group, but also improves search performance. The repertoire of profile hidden Markov model libraries, which are used for annotation of query sequences with protein families and domains, has been expanded to include the libraries from CATH-Gene3D, PIRSF, Superfamily and TIGRFAMs. Finally, we discuss the relocation of the HMMER webserver to the European Bioinformatics Institute and the potential impact that this will have.
Subject(s)
Sequence Homology, Amino Acid , Software , Algorithms , Databases, Protein , Internet , Markov Chains , Protein Structure, Tertiary , Sequence Alignment , Sequence Analysis, Protein
ABSTRACT
The database iPfam, available at http://ipfam.org, catalogues Pfam domain interactions based on known 3D structures that are found in the Protein Data Bank, providing interaction data at the molecular level. Previously, the iPfam domain-domain interaction data was integrated within the Pfam database and website, but it has now been migrated to a separate database. This allows for independent development, improving data access and giving clearer separation between the protein family and interactions datasets. In addition to domain-domain interactions, iPfam has been expanded to include interaction data for domain bound small molecule ligands. Functional annotations are provided from source databases, supplemented by the incorporation of Wikipedia articles where available. iPfam (version 1.0) contains >9500 domain-domain and 15 500 domain-ligand interactions. The new website provides access to this data in a variety of ways, including interactive visualizations of the interaction data.
Subject(s)
Databases, Protein , Protein Interaction Domains and Motifs , Internet , Ligands , Protein Interaction Mapping , Protein Interaction Maps , Proteins/classification , Software
ABSTRACT
Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a comprehensive review of the features that are provided by Pfam over and above the basic family data. For each feature, we determined the relevance, computational burden, usage statistics and the functionality of the feature in a website context. As a consequence of this review, we have removed some features, enhanced others and developed new ones to meet the changing demands of computational biology. Here, we describe the changes to Pfam content. Notably, we now provide family alignments based on four different representative proteome sequence data sets and a new interactive DNA search interface. We also discuss the mapping between Pfam and known 3D structures.
Subject(s)
Databases, Protein , Sequence Alignment , Sequence Analysis, Protein , Internet , Intrinsically Disordered Proteins/chemistry , Protein Conformation , Proteins/chemistry , Proteins/classification , Proteins/genetics , Proteome/chemistry , Sequence Analysis, DNA
ABSTRACT
We present a database of repetitive DNA elements, called Dfam (http://dfam.janelia.org). Many genomes contain a large fraction of repetitive DNA, much of which is made up of remnants of transposable elements (TEs). Accurate annotation of TEs enables research into their biology and can shed light on the evolutionary processes that shape genomes. Identification and masking of TEs can also greatly simplify many downstream genome annotation and sequence analysis tasks. The commonly used TE annotation tools RepeatMasker and Censor depend on sequence homology search tools such as cross_match and BLAST variants, as well as Repbase, a collection of known TE families each represented by a single consensus sequence. Dfam contains entries corresponding to all Repbase TE entries for which instances have been found in the human genome. Each Dfam entry is represented by a profile hidden Markov model, built from alignments generated using RepeatMasker and Repbase. When used in conjunction with the hidden Markov model search tool nhmmer, Dfam produces a 2.9% increase in coverage over consensus sequence search methods on a large human benchmark, while maintaining low false discovery rates, and coverage of the full human genome is 54.5%. The website provides a collection of tools and data views to support improved TE curation and annotation efforts. Dfam is also available for download in flat file format or in the form of MySQL table dumps.
Subject(s)
DNA Transposable Elements , Databases, Nucleic Acid , Genome, Human , Humans , Internet , Markov Chains , Models, Statistical , Molecular Sequence Annotation
ABSTRACT
BACKGROUND: Logos are commonly used in molecular biology to provide a compact graphical representation of the conservation pattern of a set of sequences. They render the information contained in sequence alignments or profile hidden Markov models by drawing a stack of letters for each position, where the height of the stack corresponds to the conservation at that position, and the height of each letter within a stack depends on the frequency of that letter at that position. RESULTS: We present a new tool and web server, called Skylign, which provides a unified framework for creating logos for both sequence alignments and profile hidden Markov models. In addition to static image files, Skylign creates a novel interactive logo plot for inclusion in web pages. These interactive logos enable scrolling, zooming, and inspection of underlying values. Skylign can avoid sampling bias in sequence alignments by down-weighting redundant sequences and by combining observed counts with informed priors. It also simplifies the representation of gap parameters, and can optionally scale letter heights based on alternate calculations of the conservation of a position. CONCLUSION: Skylign is available as a website, a scriptable web service with a RESTful interface, and as a software package for download. Skylign's interactive logos are easily incorporated into a web page with just a few lines of HTML markup. Skylign may be found at http://skylign.org.
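The stack-and-letter construction described in the background can be sketched for a single alignment column as follows. This assumes the classic information-content convention with a uniform background and no sequence weighting or priors; it is not Skylign's code, and the function name and example columns are hypothetical.

```python
import math


def logo_heights(column, alphabet="ACGT"):
    """Return {letter: height in bits} for one alignment column.

    Stack height is the information content of the column relative to a
    uniform background, log2(|alphabet|) - entropy; each letter gets a
    share of the stack proportional to its frequency. Characters outside
    the alphabet (such as gaps) are ignored.
    """
    residues = [c for c in column.upper() if c in alphabet]
    if not residues:
        return {a: 0.0 for a in alphabet}
    n = len(residues)
    freqs = {a: residues.count(a) / n for a in alphabet}
    entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
    stack_height = math.log2(len(alphabet)) - entropy
    return {a: freqs[a] * stack_height for a in alphabet}


conserved = logo_heights("AAAAAAAA")  # one letter fills the whole stack
uniform = logo_heights("ACGTACGT")    # no conservation, so an empty stack
skewed = logo_heights("AAAC")         # A drawn taller than C
```

A fully conserved DNA column reaches the 2-bit maximum, and a uniform column collapses to zero. The down-weighting of redundant sequences and the use of informed priors mentioned in the results would change the frequencies fed into this calculation.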
Subject(s)
Computational Biology/methods , Internet , Sequence Alignment/methods , Sequence Analysis/methods , Software , Amino Acid Sequence , Base Sequence , Computer Graphics , DNA/chemistry , Molecular Sequence Data
ABSTRACT
Pfam is a widely used database of protein families, currently containing more than 13,000 manually curated protein families as of release 26.0. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/). Here, we report on changes that have occurred since our 2010 NAR paper (release 24.0). Over the last 2 years, we have generated 1840 new families and increased coverage of the UniProt Knowledgebase (UniProtKB) to nearly 80%. Notably, we have taken the step of opening up the annotation of our families to the Wikipedia community, by linking Pfam families to relevant Wikipedia pages and encouraging the Pfam and Wikipedia communities to improve and expand those pages. We continue to improve the Pfam website and add new visualizations, such as the 'sunburst' representation of taxonomic distribution of families. In this work we additionally address two topics that will be of particular interest to the Pfam community. First, we explain the definition and use of family-specific, manually curated gathering thresholds. Second, we discuss some of the features of domains of unknown function (also known as DUFs), which constitute a rapidly growing class of families within Pfam.
Subject(s)
Databases, Protein , Proteins/classification , Encyclopedias as Topic , Internet , Protein Structure, Tertiary , Sequence Homology, Amino Acid
ABSTRACT
We examined the coding sequence of 518 protein kinases, approximately 1.3 Mb of DNA per sample, in 25 breast cancers. In many tumors, we detected no somatic mutations. But a few had numerous somatic mutations with distinctive patterns indicative of either a mutator phenotype or a past exposure.
Subject(s)
Breast Neoplasms/genetics , Carcinoma, Ductal, Breast/genetics , Mutation , Protein Kinases/genetics , Aged , DNA Mutational Analysis , Female , Humans , Multigene Family
ABSTRACT
Understanding the cell-type composition and spatial organization of brain regions is crucial for interpreting brain computation and function. In the thalamus, the anterior thalamic nuclei (ATN) are involved in a wide variety of functions, yet the cell-type composition of the ATN remains unmapped at a single-cell and spatial resolution. Combining single-cell RNA sequencing, spatial transcriptomics, and multiplexed fluorescent in situ hybridization, we identify three discrete excitatory cell-type clusters that correspond to the known nuclei of the ATN and uncover marker genes, molecular pathways, and putative functions of these cell types. We further illustrate graded spatial variation along the dorsomedial-ventrolateral axis for all individual nuclei of the ATN and additionally demonstrate that the anteroventral nucleus exhibits spatially covarying protein products and long-range inputs. Collectively, our study reveals discrete and continuous cell-type organizational principles of the ATN, which will help to guide and interpret experiments on ATN computation and function.
Subject(s)
Anterior Thalamic Nuclei , Animals , Mice , Anterior Thalamic Nuclei/metabolism , In Situ Hybridization, Fluorescence
ABSTRACT
Vision provides animals with detailed information about their surroundings, conveying diverse features such as color, form, and movement across the visual scene. Computing these parallel spatial features requires a large and diverse network of neurons, such that in animals as distant as flies and humans, visual regions comprise half the brain's volume. These visual brain regions often reveal remarkable structure-function relationships, with neurons organized along spatial maps with shapes that directly relate to their roles in visual processing. To unravel the stunning diversity of a complex visual system, a careful mapping of the neural architecture matched to tools for targeted exploration of that circuitry is essential. Here, we report a new connectome of the right optic lobe from a male Drosophila central nervous system FIB-SEM volume and a comprehensive inventory of the fly's visual neurons. We developed a computational framework to quantify the anatomy of visual neurons, establishing a basis for interpreting how their shapes relate to spatial vision. By integrating this analysis with connectivity information, neurotransmitter identity, and expert curation, we classified the ~53,000 neurons into 727 types, about half of which are systematically described and named for the first time. Finally, we share an extensive collection of split-GAL4 lines matched to our neuron type catalog. Together, this comprehensive set of tools and data unlocks new possibilities for systematic investigations of vision in Drosophila, a foundation for a deeper understanding of sensory processing.
ABSTRACT
Cancers arise owing to mutations in a subset of genes that confer growth advantage. The availability of the human genome sequence led us to propose that systematic resequencing of cancer genomes for mutations would lead to the discovery of many additional cancer genes. Here we report more than 1,000 somatic mutations found in 274 megabases (Mb) of DNA corresponding to the coding exons of 518 protein kinase genes in 210 diverse human cancers. There was substantial variation in the number and pattern of mutations in individual cancers reflecting different exposures, DNA repair defects and cellular origins. Most somatic mutations are likely to be 'passengers' that do not contribute to oncogenesis. However, there was evidence for 'driver' mutations contributing to the development of the cancers studied in approximately 120 genes. Systematic sequencing of cancer genomes therefore reveals the evolutionary diversity of cancers and implicates a larger repertoire of cancer genes than previously anticipated.
Subject(s)
Genes, Neoplasm/genetics , Genome, Human/genetics , Genomics , Mutation/genetics , Neoplasms/genetics , Amino Acid Sequence , DNA Mutational Analysis , Humans , Molecular Sequence Data , Neoplasm Proteins/chemistry , Neoplasm Proteins/genetics , Protein Kinases/chemistry , Protein Kinases/genetics
ABSTRACT
HMMER is a software suite for protein sequence similarity searches using probabilistic methods. Previously, HMMER has mainly been available only as a computationally intensive UNIX command-line tool, restricting its use. Recent advances in the software, HMMER3, have resulted in a 100-fold speed gain relative to previous versions. It is now feasible to make efficient profile hidden Markov model (profile HMM) searches via the web. A HMMER web server (http://hmmer.janelia.org) has been designed and implemented such that most protein database searches return within a few seconds. Methods are available for searching either a single protein sequence, multiple protein sequence alignment or profile HMM against a target sequence database, and for searching a protein sequence against Pfam. The web server is designed to cater to a range of different user expertise and accepts batch uploading of multiple queries at once. All search methods are also available as RESTful web services, thereby allowing them to be readily integrated as remotely executed tasks in locally scripted workflows. We have focused on minimizing search times and the ability to rapidly display tabular results, regardless of the number of matches found, developing graphical summaries of the search results to provide quick, intuitive appraisement of them.
Subject(s)
Sequence Alignment/methods , Sequence Analysis, Protein , Software , Internet
ABSTRACT
The laminae of the neocortex are fundamental processing layers of the mammalian brain. Notably, such laminae are believed to be relatively stereotyped across short spatial scales such that shared laminae between nearby brain regions exhibit similar constituent cells. Here, we consider a potential exception to this rule by studying the retrosplenial cortex (RSC), a brain region known for sharp cytoarchitectonic differences across its granular-dysgranular border. Using a variety of transcriptomics techniques, we identify, spatially map, and interpret the excitatory cell-type landscape of the mouse RSC. In doing so, we uncover that RSC gene expression and cell types change sharply at the granular-dysgranular border. Additionally, supposedly homologous laminae between the RSC and the neocortex are effectively wholly distinct in their cell-type composition. Collectively, the RSC exhibits a variety of intrinsic cell-type specializations and embodies an organizational principle wherein cell-type identities can vary sharply within and between brain regions.