Results 1 - 20 of 205
1.
Nat Rev Mol Cell Biol ; 24(6): 430-447, 2023 06.
Article in English | MEDLINE | ID: mdl-36596869

ABSTRACT

Genes specifying long non-coding RNAs (lncRNAs) occupy a large fraction of the genomes of complex organisms. The term 'lncRNAs' encompasses RNA polymerase I (Pol I), Pol II and Pol III transcribed RNAs, and RNAs from processed introns. The various functions of lncRNAs and their many isoforms and interleaved relationships with other genes make lncRNA classification and annotation difficult. Most lncRNAs evolve more rapidly than protein-coding sequences, are cell type specific and regulate many aspects of cell differentiation and development and other physiological processes. Many lncRNAs associate with chromatin-modifying complexes, are transcribed from enhancers and nucleate phase separation of nuclear condensates and domains, indicating an intimate link between lncRNA expression and the spatial control of gene expression during development. lncRNAs also have important roles in the cytoplasm and beyond, including in the regulation of translation, metabolism and signalling. lncRNAs often have a modular structure and are rich in repeats, which are increasingly being shown to be relevant to their function. In this Consensus Statement, we address the definition and nomenclature of lncRNAs and their conservation, expression, phenotypic visibility, structure and functions. We also discuss research challenges and provide recommendations to advance the understanding of the roles of lncRNAs in development, cell biology and disease.


Subject(s)
RNA, Long Noncoding , RNA, Long Noncoding/genetics , Cell Nucleus/genetics , Chromatin/genetics , Regulatory Sequences, Nucleic Acid , RNA Polymerase II/genetics
2.
Mol Cell ; 82(21): 4018-4032.e9, 2022 11 03.
Article in English | MEDLINE | ID: mdl-36332605

ABSTRACT

Kinetochore assembly on centromeres is central for chromosome segregation, and defects in this process cause mitotic errors and aneuploidy. Besides the well-established protein network, emerging evidence suggests the involvement of regulatory RNA in kinetochore assembly; however, the identity of such RNA has remained elusive, let alone its mechanism of action in this critical process. Here, we report CCTT, a previously uncharacterized long non-coding RNA (lncRNA) transcribed from the arm of human chromosome 17, which plays a vital role in kinetochore assembly. We show that CCTT highly localizes to all centromeres via the formation of RNA-DNA triplex and specifically interacts with CENP-C to help engage this blueprint protein in centromeres, and consequently, CCTT loss triggers extensive mitotic errors and aneuploidy. These findings uncover a non-centromere-derived lncRNA that recruits CENP-C to centromeres and shed critical light on the function of centromeric DNA sequences as anchor points for kinetochore assembly.


Subject(s)
RNA, Long Noncoding , Humans , Aneuploidy , Centromere Protein A/metabolism , DNA , Kinetochores/metabolism , RNA, Long Noncoding/genetics , Centromere
3.
Nat Immunol ; 18(5): 499-508, 2017 05.
Article in English | MEDLINE | ID: mdl-28319097

ABSTRACT

Innate lymphoid cells (ILCs) communicate with other hematopoietic and nonhematopoietic cells to regulate immunity, inflammation and tissue homeostasis. How ILC lineages develop and are maintained remains largely unknown. In this study we observed that a divergent long noncoding RNA (lncRNA), lncKdm2b, was expressed at high levels in intestinal group 3 ILCs (ILC3s). LncKdm2b deficiency in the hematopoietic system led to reductions in the number and effector functions of ILC3s. LncKdm2b expression sustained the maintenance of ILC3s by promoting their proliferation through activation of the transcription factor Zfp292. Mechanistically, lncKdm2b recruited the chromatin organizer Satb1 and the nuclear remodeling factor (NURF) complex onto the Zfp292 promoter to initiate its transcription. Deletion of Zfp292 or Bptf also abrogated the maintenance of ILC3s, leading to susceptibility to bacterial infection. Therefore, our findings reveal that lncRNAs may represent an additional layer of regulation of ILC development and function.


Subject(s)
Bacterial Infections/genetics , F-Box Proteins/genetics , Immunity, Innate , Jumonji Domain-Containing Histone Demethylases/genetics , Lymphocytes/physiology , RNA, Long Noncoding/genetics , Animals , Antigens, Nuclear/genetics , Cell Differentiation/genetics , Cell Lineage/genetics , Cell Proliferation/genetics , Chromatin Assembly and Disassembly , DNA-Binding Proteins/genetics , Disease Susceptibility , Matrix Attachment Region Binding Proteins/genetics , Mice , Mice, Inbred C57BL , Mice, Knockout , Nerve Tissue Proteins/genetics , Transcription Factors/genetics , Transcriptional Activation
4.
Nature ; 621(7978): 336-343, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37674081

ABSTRACT

Birds are descended from non-avialan theropod dinosaurs of the Late Jurassic period, but the earliest phase of this evolutionary process remains unclear owing to the exceedingly sparse and spatio-temporally restricted fossil record. Information about the early-diverging species along the avialan line is crucial to understand the evolution of the characteristic bird bauplan, and to reconcile phylogenetic controversies over the origin of birds. Here we describe one of the stratigraphically youngest and geographically southernmost Jurassic avialans, Fujianvenator prodigiosus gen. et sp. nov., from the Tithonian age of China. This specimen exhibits an unusual set of morphological features that are shared with other stem avialans, troodontids and dromaeosaurids, showing the effects of evolutionary mosaicism in deep avialan phylogeny. F. prodigiosus is distinct from all other Mesozoic avialan and non-avialan theropods in having a particularly elongated hindlimb, suggestive of a terrestrial or wading lifestyle, in contrast with other early avialans, which exhibit morphological adaptations to arboreal or aerial environments. During our fieldwork in Zhenghe where F. prodigiosus was found, we discovered a diverse assemblage of vertebrates dominated by aquatic and semi-aquatic species, including teleosts, testudines and choristoderes. Using in situ radioisotopic dating and stratigraphic surveys, we were able to date the fossil-containing horizons in this locality, which we name the Zhenghe Fauna, to 148-150 million years ago. The diversity of the Zhenghe Fauna and its precise chronological framework will provide key insights into terrestrial ecosystems of the Late Jurassic.


Subject(s)
Birds , Dinosaurs , Fossils , Animals , China , Dinosaurs/anatomy & histology , Dinosaurs/classification , Ecosystem , Mosaicism , Phylogeny , Birds/anatomy & histology , Birds/classification , History, Ancient , Hindlimb
5.
Bioinformatics ; 40(6)2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38808568

ABSTRACT

MOTIVATION: There are many clustered transcriptionally active regions in the human genome, in which the transcription complex cannot immediately terminate transcription at the upstream gene termination site, but instead continues to transcribe intergenic regions and downstream genes, resulting in read-through transcripts. Several studies have demonstrated the regulatory roles of read-through transcripts in tumorigenesis and development. However, limited by the read length of next-generation sequencing, discovery of read-through transcripts has been slow. For long but also erroneous third-generation sequencing data, this study developed a novel minimizer sketch algorithm to accurately and quickly identify read-through transcripts. RESULTS: Readon initially splits the reference sequence into distinct active regions. It employs a sliding window approach within each region, calculates minimizers, and constructs the specialized structured arrays for query indexing. Following initial alignment anchor screening of candidate read-through transcripts, further confirmation steps are executed. Comparative assessments against existing software reveal Readon's superior performance on both simulated and validated real data. Additionally, two downstream tools are provided: one for predicting whether a read-through transcript is likely to undergo nonsense-mediated decay or encodes a protein, and another for visualizing splicing patterns. AVAILABILITY AND IMPLEMENTATION: Readon is freely available on GitHub (https://github.com/Bulabula45/Readon).
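The sliding-window minimizer step mentioned in the abstract above can be illustrated with a minimal sketch. This is not Readon's implementation, which the abstract does not describe in detail; it is the generic textbook scheme, assuming plain lexicographic ordering of k-mers (practical tools typically order k-mers by a hash instead). The function name `minimizers` and the parameters `k` and `w` are hypothetical.

```python
def minimizers(seq, k, w):
    """Return the sorted, de-duplicated (position, k-mer) minimizers of seq.

    For every window of w consecutive k-mers, the lexicographically
    smallest k-mer is selected; ties are broken by the leftmost position.
    Assumption: lexicographic order stands in for the hash-based order
    that real tools usually apply.
    """
    # All overlapping k-mers; index i is the start position of the k-mer in seq.
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    # Slide a window covering w consecutive k-mers across the sequence.
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        offset = min(range(w), key=lambda j: (window[j], j))
        picked.add((start + offset, window[offset]))
    return sorted(picked)


if __name__ == "__main__":
    for position, kmer in minimizers("ACGTACGT", k=3, w=2):
        print(position, kmer)
```

Because adjacent windows often share the same smallest k-mer, the selected set is much smaller than the full k-mer list, which is what makes minimizers useful as a compact index for matching long, error-prone reads against reference regions.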


Subject(s)
Algorithms , High-Throughput Nucleotide Sequencing , Software , Humans , High-Throughput Nucleotide Sequencing/methods , Genome, Human , Sequence Analysis, RNA/methods
6.
Nucleic Acids Res ; 51(D1): D232-D239, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36373614

ABSTRACT

Noncoding RNAs (ncRNAs) play key regulatory roles in biological processes by interacting with other biomolecules. With the development of high-throughput sequencing and experimental technologies, extensive ncRNA interactions have been accumulated. Therefore, we updated the NPInter database to a fifth version to document these interactions. ncRNA interaction entries were doubled from 1 100 618 to 2 596 695 by manual literature mining and high-throughput data processing. We integrated global RNA-DNA interactions from iMARGI, ChAR-seq and GRID-seq, greatly expanding the number of RNA-DNA interactions (from 888 915 to 8 329 382). In addition, we collected different types of RNA interaction between SARS-CoV-2 virus and its host from recently published studies. Long noncoding RNA (lncRNA) expression specificity in different cell types from tumor single cell RNA-seq (scRNA-seq) data was also integrated to provide a cell-type level view of interactions. A new module named RBP was built to display the interactions of RNA-binding proteins with annotations of localization, binding domains and functions. In conclusion, NPInter v5.0 (http://bigdata.ibp.ac.cn/npinter5/) provides informative and valuable ncRNA interactions for biological researchers.


Subject(s)
Databases, Nucleic Acid , RNA, Untranslated , Humans , COVID-19/genetics , DNA/metabolism , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , RNA, Untranslated/genetics , RNA, Untranslated/metabolism , SARS-CoV-2/genetics , SARS-CoV-2/metabolism
7.
Nucleic Acids Res ; 50(D1): D265-D272, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34871445

ABSTRACT

Piwi-interacting RNAs are a type of small noncoding RNA that have various functions. piRBase is a manually curated resource focused on assisting piRNA functional analysis. piRBase release v3.0 is committed to providing more comprehensive piRNA related information. The latest release covers >181 million unique piRNA sequences, including 440 datasets from 44 species. More disease-related piRNAs and piRNA targets have been collected and displayed. The regulatory relationships between piRNAs and targets have been visualized. In addition to the reuse and expansion of the content in the previous version, the latest version has additional new content, including gold standard piRNA sets, piRNA clusters, piRNA variants, splicing-junction piRNAs, and piRNA expression data. In addition, the entire web interface has been redesigned to provide a better experience for users. piRBase release v3.0 is free to access, browse, search, and download at http://bigdata.ibp.ac.cn/piRBase.


Subject(s)
Databases, Nucleic Acid , Genome , RNA, Small Interfering/genetics , User-Computer Interface , Animals , Datasets as Topic , Humans , Internet , Molecular Sequence Annotation , Multigene Family , RNA Splicing , RNA, Small Interfering/classification , RNA, Small Interfering/metabolism
8.
EMBO J ; 38(17): e101110, 2019 09 02.
Article in English | MEDLINE | ID: mdl-31334575

ABSTRACT

Hepatocellular carcinoma (HCC) is the most prevalent liver cancer, characterized by a high rate of recurrence and heterogeneity. Liver cancer stem cells (CSCs) may well contribute to both of these pathological properties, but the mechanism underlying their self-renewal maintenance is poorly understood. Here, we identified a long noncoding RNA (lncRNA) termed HAND2-AS1 that is highly expressed in liver CSCs. Human HAND2-AS1 and its mouse ortholog lncHand2 display a high level of conservation. HAND2-AS1 is required for the self-renewal maintenance of liver CSCs to initiate HCC development. Mechanistically, HAND2-AS1 recruits the INO80 chromatin-remodeling complex to the promoter of BMPR1A, thereby inducing its expression and leading to the activation of BMP signaling. Importantly, interfering with expression of HAND2-AS1 by antisense oligonucleotides (ASOs) and BMPR1A by siRNAs has synergistic anti-tumorigenic effects on humanized HCC models. Moreover, knockout of lncHand2 or Bmpr1a in mouse hepatocytes impairs BMP signaling and suppresses the initiation of liver cancer. Our findings reveal that HAND2-AS1 promotes the self-renewal of liver CSCs and drives liver oncogenesis, offering a potential new target for HCC therapy.


Subject(s)
Carcinoma, Hepatocellular/genetics , Liver Neoplasms/genetics , Neoplastic Stem Cells/chemistry , RNA, Long Noncoding/genetics , Signal Transduction , ATPases Associated with Diverse Cellular Activities/genetics , Animals , Bone Morphogenetic Protein Receptors, Type I/genetics , Bone Morphogenetic Proteins/metabolism , Carcinoma, Hepatocellular/metabolism , Carcinoma, Hepatocellular/pathology , Cell Line, Tumor , Cell Self Renewal , DNA-Binding Proteins/genetics , Gene Expression Regulation, Neoplastic , Humans , Liver Neoplasms/metabolism , Liver Neoplasms/pathology , Mice , Neoplasm Transplantation , Neoplastic Stem Cells/pathology , Up-Regulation
9.
Genome Res ; 30(11): 1570-1582, 2020 11.
Article in English | MEDLINE | ID: mdl-33060173

ABSTRACT

Retrotransposons populate vertebrate genomes, and when active, are thought to cause genome instability with potential benefit to genome evolution. Retrotransposon-derived RNAs are also known to give rise to small endo-siRNAs to help maintain heterochromatin at their sites of transcription; however, as not all heterochromatic regions are equally active in transcription, it remains unclear how heterochromatin is maintained across the genome. Here, we address these problems by defining the origins of repeat-derived RNAs and their specific chromatin locations in Drosophila S2 cells. We demonstrate that repeat RNAs are predominantly derived from active gypsy elements and processed by Dcr-2 into small RNAs to help maintain pericentromeric heterochromatin. We also show in cultured S2 cells that synthetic repeat-derived endo-siRNA mimics are sufficient to rescue Dcr-2-deficiency-induced defects in heterochromatin formation in interphase and chromosome segregation during mitosis, demonstrating that active retrotransposons are required for stable genetic inheritance.


Subject(s)
Cell Division/genetics , Heterochromatin , Retroelements , Animals , Centromere , Chromosome Segregation , Drosophila/genetics , Drosophila Proteins/genetics , Euchromatin , High-Throughput Nucleotide Sequencing , RNA Helicases/genetics , RNA, Small Interfering , Ribonuclease III/genetics
10.
Mol Cell Proteomics ; 20: 100109, 2021.
Article in English | MEDLINE | ID: mdl-34129944

ABSTRACT

Many small ORFs embedded in long noncoding RNA (lncRNA) transcripts have been shown to encode biologically functional polypeptides (small ORF-encoded polypeptides [SEPs]) in different organisms. Although some novel SEPs have been found, their identification is still hampered by their poor predictability, diminutive size, and low relative abundance. Here, we take advantage of NONCODE, a repository containing the most complete collection and annotation of lncRNA transcripts from different species, to build a novel database that attempts to maximize a collection of SEPs from human and mouse lncRNA transcripts. In order to further improve SEP discovery, we implemented two effective and complementary polypeptide enrichment strategies using 30-kDa molecular weight cutoff filter and C8 solid-phase extraction column. These combined strategies enabled us to discover 353 SEPs from eight human cell lines and 409 SEPs from three mouse cell lines and eight mouse tissues. Importantly, 19 of them were then verified through in vitro expression, immunoblotting, parallel reaction monitoring, and synthetic peptides. Subsequent bioinformatics analysis revealed that some of the physical and chemical properties of these novel SEPs, including amino acid composition and codon usage, are different from those commonly found in canonical proteins. Intriguingly, nearly 65% of the identified SEPs were found to be initiated with non-AUG start codons. The 762 novel SEPs probably represent the largest number of SEPs detected by MS reported to date. These novel SEPs might not only provide new clues for the annotation of noncoding elements in the genome but also serve as a valuable resource for functional study.


Subject(s)
Open Reading Frames , Peptides , RNA, Long Noncoding , Animals , Cell Line , Female , Humans , Male , Mass Spectrometry , Mice, Inbred C57BL
11.
Nucleic Acids Res ; 49(D1): D165-D171, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33196801

ABSTRACT

NONCODE (http://www.noncode.org/) is a comprehensive database of collection and annotation of noncoding RNAs, especially long non-coding RNAs (lncRNAs) in animals. NONCODEV6 is dedicated to providing the full scope of lncRNAs across plants and animals. The number of lncRNAs in NONCODEV6 has increased from 548 640 to 644 510 since the last update in 2017. The number of human lncRNAs has increased from 172 216 to 173 112. The number of mouse lncRNAs increased from 131 697 to 131 974. The number of plant lncRNAs is 94 697. The relationships between human lncRNAs and cancer were updated with transcriptome sequencing profiles. Three important new features were also introduced in NONCODEV6: (i) updated human lncRNA-disease relationships, especially cancer; (ii) lncRNA annotations with tissue expression profiles and predicted function in five common plants; (iii) lncRNA conservation annotation at transcript level for 23 plant species. NONCODEV6 is accessible through http://www.noncode.org/.


Subject(s)
Databases, Nucleic Acid , Neoplasms/genetics , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , Software , Transcriptome , Animals , Base Sequence , Conserved Sequence , Exons , Gene Expression Profiling , Humans , Internet , Mice , Molecular Sequence Annotation , Neoplasms/classification , Neoplasms/metabolism , Neoplasms/pathology , Plants/genetics , RNA, Long Noncoding/classification , RNA, Long Noncoding/metabolism , RNA, Messenger/classification , RNA, Messenger/metabolism
12.
Sichuan Da Xue Xue Bao Yi Xue Ban ; 54(5): 855-856, 2023 Sep.
Article in Chinese | MEDLINE | ID: mdl-37866938

ABSTRACT

The application of big data technology combined with large language models is expected to make an enormous impact in the field of medicine. Herein, the prospects for the application of healthcare big data combined with large language models were discussed in several aspects, including first in assisting doctors in making diagnosis and differential diagnosis and, then, in the field of evidence-based medicine. In addition, healthcare big data combined with large language models could also be applied in assisting doctors to conduct clinical and medical research. Through combining healthcare big data with large language models, medical diagnosis and treatment with improved precision, efficiency, and intelligence will be realized and greater contributions will be made to the field of human health.


Subject(s)
Big Data , Biomedical Research , Humans , Artificial Intelligence , Delivery of Health Care
13.
EMBO J ; 37(8)2018 04 13.
Article in English | MEDLINE | ID: mdl-29535137

ABSTRACT

Divergent long noncoding RNAs (lncRNAs) represent a major lncRNA biotype in mouse and human genomes. The biological and molecular functions of the divergent lncRNAs remain largely unknown. Here, we show that lncKdm2b, a divergent lncRNA for Kdm2b gene, is conserved among five mammalian species and highly expressed in embryonic stem cells (ESCs) and early embryos. LncKdm2b knockout impairs ESC self-renewal and causes early embryonic lethality. LncKdm2b can activate Zbtb3 by promoting the assembly and ATPase activity of Snf2-related CREBBP activator protein (SRCAP) complex in trans. Zbtb3 potentiates the ESC self-renewal in a Nanog-dependent manner. Finally, Zbtb3 deficiency impairs the ESC self-renewal and early embryonic development. Therefore, our findings reveal that lncRNAs may represent an additional layer of the regulation of ESC self-renewal and early embryogenesis.


Subject(s)
DNA-Binding Proteins/genetics , Embryonic Stem Cells/metabolism , RNA, Long Noncoding/genetics , Animals , Embryonic Development , Humans , Mice, Knockout
14.
Genome Res ; 29(9): 1521-1532, 2019 09.
Article in English | MEDLINE | ID: mdl-31315906

ABSTRACT

Long noncoding RNAs (lncRNAs) can regulate the activity of target genes by participating in the organization of chromatin architecture. We have devised a "chromatin-RNA in situ reverse transcription sequencing" (CRIST-seq) approach to profile the lncRNA interaction network in gene regulatory elements by combining the simplicity of RNA biotin labeling with the specificity of the CRISPR/Cas9 system. Using gene-specific gRNAs, we describe a pluripotency-specific lncRNA interacting network in the promoters of Sox2 and Pou5f1, two critical stem cell factors that are required for the maintenance of pluripotency. The promoter-interacting lncRNAs were specifically activated during reprogramming into pluripotency. Knockdown of these lncRNAs caused the stem cells to exit from pluripotency. In contrast, overexpression of the pluripotency-associated lncRNA activated the promoters of core stem cell factor genes and enhanced fibroblast reprogramming into pluripotency. These CRIST-seq data suggest that the Sox2 and Pou5f1 promoters are organized within a unique lncRNA interaction network that determines the fate of pluripotency during reprogramming. This CRIST approach may be broadly used to map lncRNA interaction networks at target loci across the genome.


Subject(s)
Chromatin/genetics , Octamer Transcription Factor-3/genetics , RNA, Long Noncoding/genetics , SOXB1 Transcription Factors/genetics , Sequence Analysis, RNA/methods , Animals , CRISPR-Cas Systems , Cell Line , Cellular Reprogramming , Fibroblasts/cytology , Fibroblasts/metabolism , Mice , Pluripotent Stem Cells/cytology , Pluripotent Stem Cells/metabolism , Promoter Regions, Genetic , Regulatory Sequences, Nucleic Acid
15.
Nucleic Acids Res ; 48(D1): D233-D237, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31713629

ABSTRACT

Spatially resolved transcriptomic techniques allow the characterization of spatial organization of cells in tissues, which revolutionize the studies of tissue function and disease pathology. New strategies for detecting spatial gene expression patterns are emerging, and spatially resolved transcriptomic data are accumulating rapidly. However, it is not convenient for biologists to exploit these data due to the diversity of strategies and complexity in data analysis. Here, we present SpatialDB, the first manually curated database for spatially resolved transcriptomic techniques and datasets. The current version of SpatialDB contains 24 datasets (305 sub-datasets) from 5 species generated by 8 spatially resolved transcriptomic techniques. SpatialDB provides a user-friendly web interface for visualization and comparison of spatially resolved transcriptomic data. To further explore these data, SpatialDB also provides spatially variable genes and their functional enrichment annotation. SpatialDB offers a repository for research community to investigate the spatial cellular structure of tissues, and may bring new insights into understanding the cellular microenvironment in disease. SpatialDB is freely available at https://www.spatialomics.org/SpatialDB.


Subject(s)
Databases, Genetic , Gene Expression Profiling , Animals , Humans , Mice , Transcriptome
16.
Nucleic Acids Res ; 48(D1): D160-D165, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31670377

ABSTRACT

Noncoding RNAs (ncRNAs) play crucial regulatory roles in a variety of biological circuits. To document regulatory interactions between ncRNAs and biomolecules, we previously created the NPInter database (http://bigdata.ibp.ac.cn/npinter). Since the last version of NPInter was issued, a rapidly growing number of studies have reported novel interactions and accumulated numerous high-throughput interactome data. We have therefore updated NPInter to its fourth edition in which are integrated 600 000 new experimentally identified ncRNA interactions. ncRNA-DNA interactions derived from ChIRP-seq data and circular RNA interactions have been included in the database. Additionally, disease associations were annotated to the interacting molecules. The database website has also been redesigned with a more user-friendly interface and several additional functional modules. Overall, NPInter v4.0 now provides more comprehensive data and services for researchers working on ncRNAs and their interactions with other biomolecules.


Subject(s)
Databases, Nucleic Acid , RNA, Untranslated/metabolism , DNA/metabolism , Disease/genetics , Humans , MicroRNAs/metabolism , RNA, Circular/metabolism
17.
BMC Genomics ; 22(1): 243, 2021 Apr 07.
Article in English | MEDLINE | ID: mdl-33827435

ABSTRACT

BACKGROUND: Altica (Coleoptera: Chrysomelidae) is a highly diverse and taxonomically challenging flea beetle genus that has been used to address questions related to host plant specialization, reproductive isolation, and ecological speciation. To further evolutionary studies in this interesting group, here we present a draft genome of a representative specialist, Altica viridicyanea, the first Alticinae genome reported thus far. RESULTS: The genome is 864.8 Mb and consists of 4490 scaffolds with a N50 size of 557 kb, which covered 98.6% complete and 0.4% partial insect Benchmarking Universal Single-Copy Orthologs. Repetitive sequences accounted for 62.9% of the assembly, and a total of 17,730 protein-coding gene models and 2462 non-coding RNA models were predicted. To provide insight into host plant specialization of this monophagous species, we examined the key gene families involved in chemosensation, detoxification of plant secondary chemistry, and plant cell wall-degradation. CONCLUSIONS: The genome assembled in this work provides an important resource for further studies on host plant adaptation and functionally affiliated genes. Moreover, this work also opens the way for comparative genomics studies among closely related Altica species, which may provide insight into the molecular evolutionary processes that occur during ecological speciation.
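The N50 statistic quoted in the abstract above (a scaffold N50 of 557 kb) can be made concrete with a minimal sketch. It assumes the usual definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly. The function name `n50` and the toy lengths are illustrative only and are not taken from the Altica assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches at least
    half of the summed length of all scaffolds.
    """
    if not lengths:
        raise ValueError("n50() requires at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point division by two.
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    # Toy scaffold lengths (hypothetical values, arbitrary units).
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

A larger N50 means that more of the assembly sits in long scaffolds, which is why it is reported alongside scaffold count and BUSCO completeness as a contiguity measure.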


Subject(s)
Coleoptera , Siphonaptera , Animals , Coleoptera/genetics , Evolution, Molecular , Genome , Genomics
18.
Nucleic Acids Res ; 47(D1): D175-D180, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30371818

ABSTRACT

PIWI-interacting RNAs are a class of small RNAs that is most abundantly expressed in animal germline. Substantial research is going on to reveal the functions of piRNAs in the epigenetic and post-transcriptional regulation of transposons and genes. To collect and annotate these data, we developed piRBase, a database assisting piRNA functional study. Since its launch in 2014, piRBase has integrated 264 data sets from 21 organisms, and the number of collected piRNAs has reached 173 million. The latest piRBase release (v2.0, 2018) was more focused on the comprehensive annotation of piRNA sequences, as well as the increasing number of piRNAs. In addition, piRBase release v2.0 also contained the potential information of piRNA targets and disease-related piRNAs. All datasets in piRBase are free to access and available for browsing, searching and bulk download at http://www.regulatoryrna.org/database/piRNA/.


Subject(s)
Base Sequence , Computational Biology/methods , Databases, Genetic , Genomics/methods , RNA, Small Interfering/genetics , Gene Expression Regulation , RNA, Small Interfering/chemistry , Software , Web Browser
19.
Brief Bioinform ; 19(4): 636-643, 2018 07 20.
Article in English | MEDLINE | ID: mdl-28137767

ABSTRACT

Small proteins is the general term for proteins with length shorter than 100 amino acids. Identification and functional studies of small proteins have advanced rapidly in recent years, and several studies have shown that small proteins play important roles in diverse functions including development, muscle contraction and DNA repair. Identification and characterization of previously unrecognized small proteins may contribute in important ways to cell biology and human health. Current databases are generally somewhat deficient in that they have either not collected small proteins systematically, or contain only predictions of small proteins in a limited number of tissues and species. Here, we present a specifically designed web-accessible database, small proteins database (SmProt, http://bioinfo.ibp.ac.cn/SmProt), which is a database documenting small proteins. The current release of SmProt incorporates 255 010 small proteins computationally or experimentally identified in 291 cell lines/tissues derived from eight popular species. The database provides a variety of data including basic information (sequence, location, gene name, organism, etc.) as well as specific information (experiment, function, disease type, etc.). To facilitate data extraction, SmProt supports multiple search options, including species, genome location, gene name and their aliases, cell lines/tissues, ORF type, gene type, PubMed ID and SmProt ID. SmProt also incorporates a service for the BLAST alignment search and provides a local UCSC Genome Browser. Additionally, SmProt defines a high-confidence set of small proteins and predicts the functions of the small proteins.


Subject(s)
Codon , Databases, Factual , Molecular Sequence Annotation , Proteins/genetics , RNA, Untranslated/genetics , RNA/genetics , Software , Humans , Proteins/metabolism
20.
Brief Bioinform ; 19(6): 1302-1309, 2018 11 27.
Article in English | MEDLINE | ID: mdl-28575155

ABSTRACT

Biological processes, especially developmental processes, are often dynamic. Previous BodyMap projects for human and mouse have provided researchers with portals to tissue-specific gene expression, but these efforts have not included dynamic gene expression patterns. Over the past few years, substantial progress in our understanding of the molecular mechanisms of protein-coding and long noncoding RNA (lncRNA) genes in development processes has been achieved through numerous time series RNA sequencing (RNA-seq) studies. However, none of the existing databases focuses on these time series data, thus rendering the exploration of dynamic gene expression patterns inconvenient. Here, we present Dynamic BodyMap (Dynamic-BM), a database for temporal gene expression profiles, obtained from 2203 time series of RNA-seq samples, covering >25 tissues from five species. Dynamic-BM has a user-friendly Web interface designed for browsing and searching the dynamic expression pattern of genes from different sources. It is an open resource for efficient data exploration, providing dynamic expression profiles of both protein-coding genes and lncRNAs to facilitate the generation of new hypotheses in developmental biology research. Additionally, Dynamic-BM includes a literature-based knowledgebase for lncRNAs associated with tissue development and a list of manually selected lncRNA candidates that may be involved in tissue development. Dynamic-BM is available at http://bioinfo.ibp.ac.cn/Dynamic-BM.


Subject(s)
Databases, Factual , Sequence Analysis, RNA/methods , Gene Expression Profiling , Internet , User-Computer Interface