ABSTRACT
SPLiT-seq provides a low-cost platform for generating single-cell data by labeling the cellular origin of RNA through four rounds of combinatorial barcoding. However, an automatic and rapid method for preprocessing and classifying single-cell sequencing (SCS) data from SPLiT-seq, one that directly identifies and labels combinatorially barcoded reads and distinguishes special cell sequencing data, is currently lacking. Here, we develop SCSit, a high-efficiency preprocessing tool for single-cell sequencing data from SPLiT-seq. SCSit directly identifies the combinatorial barcodes and UMIs of cell types, obtains more labeled reads, and markedly increases the amount of SCS data retained because insertions and deletions are aligned exactly. Compared with the original method used in SPLiT-seq, the consistency of identified reads from SCSit increases to 97%, and twice as many reads are mapped. Furthermore, the runtime of SCSit is less than 10% of that of the original method. SCSit can accurately and rapidly analyze SPLiT-seq raw data and obtain labeled reads, and it effectively improves the single-cell data from the SPLiT-seq platform. The data and source code of SCSit are available on GitHub at https://github.com/shang-qian/SCSit.
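The abstract does not describe how SCSit matches barcodes, so the following is only a minimal sketch of the general idea of indel-tolerant barcode assignment, not the tool's actual algorithm. The function names and the example whitelist are hypothetical. An observed barcode is assigned to a known barcode if it matches exactly, or if exactly one known barcode lies within a single substitution, insertion, or deletion.

```python
from typing import List, Optional


def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one substitution, insertion, or deletion."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = j = 0
    edited = False
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            i += 1
            j += 1
            continue
        if edited:
            return False  # a second mismatch means more than one edit
        edited = True
        if len(a) == len(b):
            i += 1  # substitution: skip the mismatching base in both strings
        j += 1      # insertion/deletion: skip the extra base in the longer string
    return True


def match_barcode(observed: str, whitelist: List[str]) -> Optional[str]:
    """Assign an observed barcode to a whitelist entry (hypothetical helper).

    Exact matches win; otherwise the read is assigned only if exactly one
    whitelist barcode is within one edit, so ambiguous reads are rejected.
    """
    if observed in whitelist:
        return observed
    hits = [bc for bc in whitelist if within_one_edit(observed, bc)]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 8-nt barcodes, for illustration only.
whitelist = ["AACGTGAT", "AAACATCG", "ATGCCTAA"]
exact = match_barcode("AACGTGAT", whitelist)
one_deletion = match_barcode("AACGGAT", whitelist)  # one base missing
unmatched = match_barcode("GGGGGGGG", whitelist)
```

Rejecting ambiguous matches trades some recovered reads for fewer misassigned cells; a real pipeline would apply a check like this to each of the barcoding rounds.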
ABSTRACT
[This corrects the article DOI: 10.3389/fgene.2020.00268.]
ABSTRACT
DNA 6mA modification, an important newly discovered epigenetic mark, plays a crucial role in organisms and has attracted increasing attention in recent years. Soybean is economically the most important bean in the world, providing vegetable protein for millions of people. However, the distribution pattern and function of 6mA in soybean are still unknown. In this study, we decoded 6mA modification at single-nucleotide resolution in wild and cultivated soybeans, and compared 6mA between the cytoplasmic and nuclear genomes and between wild and cultivated soybeans. The 6mA motif in the nuclear genome was conserved across the two kinds of soybean, and ANHGA was the most dominant motif in both. Genes with 6mA modification in the nucleus had higher expression than those without modification. Interestingly, the 6mA distribution patterns in the cytoplasm of each soybean were significantly different from those in the nucleus, a finding reported here for the first time in soybean. Our research provides new insight into the in-depth analysis of cytoplasmic genomic DNA modification in plants.
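To make the ANHGA motif concrete: under the standard IUPAC nucleotide codes, N stands for any base and H for A, C, or T (anything but G). The sketch below, which is an illustration and not the authors' pipeline, expands an IUPAC motif into a regular expression and lists where it occurs in a sequence. The function names and the example sequence are hypothetical, and only the forward strand is scanned.

```python
import re
from typing import List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif(seq: str, motif: str) -> List[int]:
    """Return 0-based start positions of every motif occurrence on the forward strand."""
    return [m.start() for m in motif_to_regex(motif).finditer(seq.upper())]


# Hypothetical sequence, for illustration only.
hits = find_motif("TTACAGATT", "ANHGA")            # ACAGA starts at index 2
overlapping = find_motif("ACAGACAGA", "ANHGA")     # two hits sharing one base
no_hit = find_motif("AAGGA", "ANHGA")              # G is not allowed at the H position
```

A genome-wide scan would also need to search the reverse complement, since a modified adenine can lie on either strand.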
ABSTRACT
N6-methyladenosine (6mA) DNA modification plays an important role in the epigenetic regulation of gene expression. The aberrant expression of non-coding genes, which are important regulatory elements of gene expression, is related to many diseases. However, the distribution and potential functions of 6mA modification in non-coding RNA (ncRNA) genes are still unknown. In this study, we analyzed the 6mA distribution of ncRNA genes and compared it with that of protein-coding genes in four species (Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens) using single-molecule real-time (SMRT) sequencing data. The results indicated that the consensus motifs of short nucleotides at 6mA locations were highly conserved across the four species, and that non-coding genes were less likely to be methylated than protein-coding genes. Notably, 6mA-methylated lncRNA genes were expressed at significantly lower levels than genes without methylation in A. thaliana (p = 3.295e-4), D. melanogaster (p = 3.439e-11), and H. sapiens (p = 9.087e-3). The detection and distribution profiling of 6mA modification in ncRNA regions from the four species suggest that 6mA modifications may affect their expression levels.
ABSTRACT
Structural variation (SV) represents a major form of genetic variation that contributes to polymorphism, human diseases, and phenotypes in many organisms. Long-read sequencing has been successfully used to identify novel and complex SVs. However, a comparison of SV detection tools for long-read sequencing datasets has not been reported. Therefore, we developed an analysis workflow that combined two alignment tools (NGMLR and minimap2) and five callers (Sniffles, Picky, smartie-sv, PBHoney, and NanoSV) to evaluate SV detection in six datasets of Saccharomyces cerevisiae. The accuracy of SV regions was validated by re-aligning raw reads across diverse alignment tools, SV callers, experimental conditions, and sequencing platforms. The results showed that the difference in SV detection between NGMLR and minimap2 was not significant when the same caller was used. PBHoney had the highest average accuracy (89.04%) and Picky had the lowest (35.85%). The accuracies of NanoSV, Sniffles, and smartie-sv were 68.67%, 60.47%, and 57.67%, respectively. In addition, smartie-sv and NanoSV detected the most and the fewest SVs, respectively, and significantly more SVs were detected from the PacBio sequencing platform than from ONT (p = 0.000173).
ABSTRACT
BACKGROUND: DNA methylation is an important epigenetic modification. The recently developed single-molecule real-time (SMRT) sequencing technology provides an efficient way to detect DNA N6-methyladenine (6mA) modification, which plays an important role in epigenetics and positively regulates gene expression. Gene expression is also regulated by genetic variation. However, the relationship between DNA 6mA modification and variation is still unknown. RESULTS: We collected SMRT long-read DNA, Illumina short-read DNA, and RNA datasets from the young leaves of Herrania umbratica, and used them to detect 35,654 6mA modification sites, 829,894 DNA variations, and 60,672 RNA variations, respectively. Among these, 303 DNA variations and 19 RNA variations carried 6mA modification, and 57,468 genetic variations were transmitted from DNA to RNA. The results showed that genes with 6mA modification were significantly less likely to mutate than genes without modification (p-value < 4.9e-08). A linear regression model showed that the 6mA densities of genes were associated with transmitted variations of type 0/1 to 1/1 (p-value < 0.001). CONCLUSIONS: Genes with 6mA modification had significantly fewer DNA and RNA variations than unmodified genes. Furthermore, the variations in 6mA-modified genes were readily transmitted from DNA to RNA, especially variations transmitted from a DNA heterozygote to an RNA homozygote.
Subjects
Adenosine/analogs & derivatives , Plant DNA/genetics , Plant DNA/metabolism , Genetic Variation/genetics , Plant Genome/genetics , Magnoliopsida/genetics , Plant RNA/genetics , Adenosine/metabolism , Intergenic DNA/genetics , Intergenic DNA/metabolism , Plant DNA/chemistry , Heterozygote , Homozygote , Magnoliopsida/metabolism
ABSTRACT
Eukaryotic DNA methylation has been receiving increasing attention for its crucial epigenetic regulatory function. The recently developed single-molecule real-time (SMRT) sequencing technology provides an efficient way to detect DNA N6-methyladenine (6mA) and N4-methylcytosine (4mC) modifications at a single-nucleotide resolution. The family Rosaceae contains horticultural plants with a wide range of economic importance. However, little is currently known regarding the genome-wide distribution patterns and functions of 6mA and 4mC modifications in the Rosaceae. In this study, we present an integrated DNA 6mA and 4mC modification database for the Rosaceae (MDR, http://mdr.xieslab.org). MDR, the first repository for displaying and storing DNA 6mA and 4mC methylomes from SMRT sequencing data sets for Rosaceae, includes meta and statistical information, methylation densities, Gene Ontology enrichment analyses, and genome search and browse for methylated sites in NCBI. MDR provides important information regarding DNA 6mA and 4mC methylation and may help users better understand epigenetic modifications in the family Rosaceae.
ABSTRACT
N6-methyladenine (6mA) DNA modification has been detected in several eukaryotic organisms, where it plays important roles in gene regulation and epigenetic memory maintenance. However, the genome-wide distribution patterns and potential functions of 6mA DNA modification in woodland strawberry (Fragaria vesca) remain largely unknown. Here, we examined the 6mA landscape in the F. vesca genome using single-molecule real-time sequencing technology and found that 6mA modification sites were broadly distributed across the woodland strawberry genome. The pattern of 6mA distribution in long non-coding RNAs was significantly different from that in protein-coding genes. The 6mA modification influenced gene transcription and was positively associated with gene expression, as validated by computational and experimental analyses. Our study provides new insights into DNA methylation in F. vesca.