Search | BVS - Ministry of Health

1.

DeepBAM: a high-accuracy single-molecule CpG methylation detection tool for Oxford nanopore sequencing.

Bai, Xin; Yao, Hui-Cong; Wu, Bo; Liu, Luo-Ran; Ding, Yu-Ying; Xiao, Chuan-Le.

Brief Bioinform ; 25(5), 2024 Jul 25.

Article in English | MEDLINE | ID: mdl-39177264

ABSTRACT

The recent nanopore sequencing system (R10.4) has enhanced base calling accuracy and is being increasingly utilized for detecting CpG methylation state. However, the robustness and universality of the methylation calling model in the officially supplied Dorado remain poorly tested. In this study, we obtained heterogeneous datasets from human and plant sources to carry out comprehensive evaluations, which showed that Dorado performed significantly differently across datasets. We therefore developed deep neural networks and implemented several optimizations in training a new model called DeepBAM. DeepBAM achieved superior and more stable performance compared with Dorado, including higher area under the ROC curves (98.47% on average and up to 7.36% improvement) and F1 scores (94.97% on average and up to 16.24% improvement) across the datasets. DeepBAM-based whole genome methylation frequencies achieved >0.95 correlations with BS-seq on four of five datasets, outperforming Dorado in all instances. It enables unraveling allele-specific methylation patterns, including regions of transposable elements. The enhanced performance of DeepBAM paves the way for broader applications of nanopore sequencing in CpG methylation studies.

Subjects

CpG Islands , DNA Methylation , Nanopore Sequencing , Nanopore Sequencing/methods , Humans , Software , Sequence Analysis, DNA/methods , Neural Networks, Computer

2.

NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads.

Hu, Jiang; Wang, Zhuo; Sun, Zongyi; Hu, Benxia; Ayoola, Adeola Oluwakemi; Liang, Fan; Li, Jingjing; Sandoval, José R; Cooper, David N; Ye, Kai; Ruan, Jue; Xiao, Chuan-Le; Wang, Depeng; Wu, Dong-Dong; Wang, Sheng.

Genome Biol ; 25(1): 107, 2024 04 26.

Article in English | MEDLINE | ID: mdl-38671502

ABSTRACT

Long-read sequencing data, particularly those derived from the Oxford Nanopore sequencing platform, tend to exhibit high error rates. Here, we present NextDenovo, an efficient error correction and assembly tool for noisy long reads, which achieves a high level of accuracy in genome assembly. We apply NextDenovo to assemble 35 diverse human genomes from around the world using Nanopore long-read data. These genomes allow us to identify the landscape of segmental duplication and gene copy number variation in modern human populations. The use of NextDenovo should pave the way for population-scale long-read assembly using Nanopore long-read data.

Subjects

DNA Copy Number Variations , Genome, Human , Humans , High-Throughput Nucleotide Sequencing/methods , Software , Nanopore Sequencing/methods , Sequence Analysis, DNA/methods , Genomics/methods

3.

DNA 5-methylcytosine detection and methylation phasing using PacBio circular consensus sequencing.

Ni, Peng; Nie, Fan; Zhong, Zeyu; Xu, Jinrui; Huang, Neng; Zhang, Jun; Zhao, Haochen; Zou, You; Huang, Yuanfeng; Li, Jinchen; Xiao, Chuan-Le; Luo, Feng; Wang, Jianxin.

Nat Commun ; 14(1): 4054, 2023 07 08.

Article in English | MEDLINE | ID: mdl-37422489

ABSTRACT

Long single-molecular sequencing technologies, such as PacBio circular consensus sequencing (CCS) and nanopore sequencing, are advantageous in detecting DNA 5-methylcytosine in CpGs (5mCpGs), especially in repetitive genomic regions. However, existing methods for detecting 5mCpGs using PacBio CCS are less accurate and robust. Here, we present ccsmeth, a deep-learning method to detect DNA 5mCpGs using CCS reads. We sequence polymerase-chain-reaction treated and M.SssI-methyltransferase treated DNA of one human sample using PacBio CCS for training ccsmeth. Using long (≥10 Kb) CCS reads, ccsmeth achieves 0.90 accuracy and 0.97 Area Under the Curve on 5mCpG detection at single-molecule resolution. At the genome-wide site level, ccsmeth achieves >0.90 correlations with bisulfite sequencing and nanopore sequencing using only 10× reads. Furthermore, we develop a Nextflow pipeline, ccsmethphase, to detect haplotype-aware methylation using CCS reads, and then sequence a Chinese family trio to validate it. ccsmeth and ccsmethphase can be robust and accurate tools for detecting DNA 5-methylcytosines.

Subjects

5-Methylcytosine , DNA , Humans , Consensus , DNA/genetics , Sequence Analysis, DNA/methods , DNA Methylation , High-Throughput Nucleotide Sequencing/methods

4.

High-throughput and high-accuracy single-cell RNA isoform analysis using PacBio circular consensus sequencing.

Shi, Zhuo-Xing; Chen, Zhi-Chao; Zhong, Jia-Yong; Hu, Kun-Hua; Zheng, Ying-Feng; Chen, Ying; Xie, Shang-Qian; Bo, Xiao-Chen; Luo, Feng; Tang, Chong; Xiao, Chuan-Le; Liu, Yi-Zhi.

Nat Commun ; 14(1): 2631, 2023 05 06.

Article in English | MEDLINE | ID: mdl-37149708

ABSTRACT

Although long-read single-cell RNA isoform sequencing (scISO-Seq) can reveal alternative RNA splicing in individual cells, it suffers from a low read throughput. Here, we introduce HIT-scISOseq, a method that removes most artifact cDNAs and concatenates multiple cDNAs for PacBio circular consensus sequencing (CCS) to achieve high-throughput and high-accuracy single-cell RNA isoform sequencing. HIT-scISOseq can yield >10 million high-accuracy long-reads in a single PacBio Sequel II SMRT Cell 8M. We also report the development of scISA-Tools that demultiplex HIT-scISOseq concatenated reads into single-cell cDNA reads with >99.99% accuracy and specificity. We apply HIT-scISOseq to characterize the transcriptomes of 3375 corneal limbus cells and reveal cell-type-specific isoform expression in them. HIT-scISOseq is a high-throughput, high-accuracy, technically accessible method and it can accelerate the burgeoning field of long-read single-cell transcriptomics.

Subjects

RNA Isoforms , RNA , RNA Isoforms/genetics , High-Throughput Nucleotide Sequencing/methods , Consensus , Protein Isoforms/genetics , Sequence Analysis, DNA/methods , Sequence Analysis, RNA

5.

High-throughput Pore-C reveals the single-allele topology and cell type-specificity of 3D genome folding.

Zhong, Jia-Yong; Niu, Longjian; Lin, Zhuo-Bin; Bai, Xin; Chen, Ying; Luo, Feng; Hou, Chunhui; Xiao, Chuan-Le.

Nat Commun ; 14(1): 1250, 2023 03 06.

Article in English | MEDLINE | ID: mdl-36878904

ABSTRACT

Canonical three-dimensional (3D) genome structures represent the ensemble average of pairwise chromatin interactions but not the single-allele topologies in populations of cells. Recently developed Pore-C can capture multiway chromatin contacts that reflect regional topologies of single chromosomes. By carrying out high-throughput Pore-C (HiPore-C), we reveal extensive but regionally restricted clusters of single-allele topologies that aggregate into canonical 3D genome structures in two human cell types. We show that fragments in multi-contact reads generally coexist in the same TAD. In contrast, a concurrent significant proportion of multi-contact reads span multiple compartments of the same chromatin type over megabase distances. Synergistic chromatin looping between multiple sites in multi-contact reads is rare compared to pairwise interactions. Interestingly, the single-allele topology clusters are cell type-specific even inside highly conserved TADs in different types of cells. In summary, HiPore-C enables global characterization of single-allele topologies at an unprecedented depth to reveal elusive genome folding principles.

Subjects

Chromatin , Humans , Alleles , Chromatin/genetics

6.

NanoSNP: a progressive and haplotype-aware SNP caller on low-coverage nanopore sequencing data.

Huang, Neng; Xu, Minghua; Nie, Fan; Ni, Peng; Xiao, Chuan-Le; Luo, Feng; Wang, Jianxin.

Bioinformatics ; 39(1), 2023 01 01.

Article in English | MEDLINE | ID: mdl-36548365

ABSTRACT

MOTIVATION: Oxford Nanopore sequencing has great potential and advantages in population-scale studies. Due to the cost of sequencing, the depth of whole-genome sequencing per individual sample must be small. However, the existing single nucleotide polymorphism (SNP) callers are aimed at high-coverage Nanopore sequencing reads. Detecting SNP variants on low-coverage Nanopore sequencing data is still a challenging problem. RESULTS: We developed a novel deep learning-based SNP calling method, NanoSNP, to identify SNP sites (excluding short indels) based on low-coverage Nanopore sequencing reads. In this method, we design a multi-step, multi-scale and haplotype-aware SNP detection pipeline. First, the pileup model in NanoSNP utilizes the naive pileup feature to predict a subset of SNP sites with a bidirectional long short-term memory (Bi-LSTM) network. These SNP sites are phased and used to divide the low-coverage Nanopore reads into different haplotypes. Finally, the long-range haplotype feature and short-range pileup feature are extracted from each haplotype. The haplotype model combines the two features and predicts the genotype for the candidate site using a Bi-LSTM network. To evaluate the performance of NanoSNP, we compared NanoSNP with Clair, Clair3, Pepper-DeepVariant and NanoCaller on low-coverage (~16×) Nanopore sequencing reads. We also performed cross-genome testing on six human genomes, HG002-HG007. Comprehensive experiments demonstrate that NanoSNP outperforms Clair, Pepper-DeepVariant and NanoCaller in identifying SNPs on low-coverage Nanopore sequencing data, including the difficult-to-map regions and major histocompatibility complex regions in the human genome. NanoSNP is comparable to Clair3 when the coverage exceeds 16×. AVAILABILITY AND IMPLEMENTATION: https://github.com/huangnengCSU/NanoSNP.git. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Nanopore Sequencing , Nanopores , Humans , Haplotypes , Software , Polymorphism, Single Nucleotide , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods

7.

Editorial: Biomedical Data Visualization: Methods and Applications.

Wu, Tianzhi; Xiao, Chuan-Le; Lam, Tommy Tsan-Yuk; Yu, Guangchuang.

Front Genet ; 13: 890775, 2022.

Article in English | MEDLINE | ID: mdl-35571011

8.

Genome-wide detection of cytosine methylations in plant from Nanopore data using deep learning.

Ni, Peng; Huang, Neng; Nie, Fan; Zhang, Jun; Zhang, Zhi; Wu, Bo; Bai, Lu; Liu, Wende; Xiao, Chuan-Le; Luo, Feng; Wang, Jianxin.

Nat Commun ; 12(1): 5976, 2021 10 13.

Article in English | MEDLINE | ID: mdl-34645826

ABSTRACT

In plants, cytosine DNA methylations (5mCs) can happen in three sequence contexts as CpG, CHG, and CHH (where H = A, C, or T), which play different roles in the regulation of biological processes. Although long Nanopore reads are advantageous in the detection of 5mCs compared to short-read bisulfite sequencing, existing methods can only detect 5mCs in the CpG context, which limits their application in plants. Here, we develop DeepSignal-plant, a deep learning tool to detect genome-wide 5mCs of all three contexts in plants from Nanopore reads. We sequence Arabidopsis thaliana and Oryza sativa using both Nanopore and bisulfite sequencing. We develop a denoising process for training models, which enables DeepSignal-plant to achieve high correlations with bisulfite sequencing for 5mC detection in all three contexts. Furthermore, DeepSignal-plant can profile more 5mC sites, which will help to provide a more complete understanding of epigenetic mechanisms of different biological processes.

Subjects

Arabidopsis/genetics , Cytosine/metabolism , DNA, Plant/genetics , Epigenesis, Genetic , Genome, Plant , Oryza/genetics , Arabidopsis/metabolism , CpG Islands , DNA Methylation , DNA, Plant/metabolism , Deep Learning , High-Throughput Nucleotide Sequencing/methods , Nanopores , Oryza/metabolism , Sequence Analysis, DNA , Sulfites/chemistry

9.

SCSit: A high-efficiency preprocessing tool for single-cell sequencing data from SPLiT-seq.

Luan, Mei-Wei; Lin, Jia-Lun; Wang, Ye-Fan; Liu, Yu-Xiao; Xiao, Chuan-Le; Wu, Rongling; Xie, Shang-Qian.

Comput Struct Biotechnol J ; 19: 4574-4580, 2021.

Article in English | MEDLINE | ID: mdl-34471500

ABSTRACT

SPLiT-seq provides a low-cost platform to generate single-cell data by labeling the cellular origin of RNA through four rounds of combinatorial barcoding. However, an automatic and rapid method for preprocessing and classifying single-cell sequencing (SCS) data from SPLiT-seq, one that directly identifies and labels combinatorial barcoding reads and distinguishes special cell sequencing data, is currently lacking. Here, we develop a high-efficiency preprocessing tool for single-cell sequencing data from SPLiT-seq (SCSit), which can directly identify combinatorial barcodes and UMI of cell types, obtain more labeled reads, and remarkably enhance the retained data from SCS due to the exact alignment of insertion and deletion. Compared with the original method used in SPLiT-seq, the consistency of identified reads from SCSit increases to 97%, and the number of mapped reads is twice that of the original. Furthermore, the runtime of SCSit is less than 10% of the original. It can accurately and rapidly analyze SPLiT-seq raw data and obtain labeled reads, as well as effectively improve the single-cell data from the SPLiT-seq platform. The data and source of SCSit are available on the GitHub website https://github.com/shang-qian/SCSit.

10.

Genomic Elucidation of a COVID-19 Resurgence and Local Transmission of SARS-CoV-2 in Guangzhou, China.

Jia, Hong-Ling; Li, Peng; Liu, Hong-Jie; Zhong, Jia-Yong; Qin, Peng-Zhe; Su, Wen-Zhe; Zheng, Ying-Feng; Li, Kui-Biao; Zeng, Qing; Li, Jin-Hui; Li, Li-Zhong; Cao, Lan; Wu, Ji-Bin; Chen, Yi-Yun; Jia, Lei-Li; Song, Hong-Bin; Zhang, Qi-Wei; Yang, Guang; Jing, Chun-Xia; Bo, Xiao-Chen; Zhang, Zhou-Bin; Di, Biao; Xiao, Chuan-Le; Ni, Ming.

J Clin Microbiol ; 59(8): e0007921, 2021 07 19.

Article in English | MEDLINE | ID: mdl-33952598

ABSTRACT

While China experienced a peak and decline in coronavirus disease 2019 (COVID-19) cases at the start of 2020, regional outbreaks continuously emerged in subsequent months. Resurgences of COVID-19 have also been observed in many other countries. In Guangzhou, China, a small outbreak, involving less than 100 residents, emerged in March and April 2020, and comprehensive and near-real-time genomic surveillance of SARS-CoV-2 was conducted. When the numbers of confirmed cases among overseas travelers increased, public health measures were enhanced by shifting from self-quarantine to central quarantine and SARS-CoV-2 testing for all overseas travelers. In an analysis of 109 imported cases, we found diverse viral variants distributed in the global viral phylogeny, which were frequently shared within households but not among passengers on the same flight. In contrast to the viral diversity of imported cases, local transmission was predominately attributed to two specific variants imported from Africa, including local cases that reported no direct or indirect contact with imported cases. The introduction events of the virus were identified or deduced before the enhanced measures were taken. These results show the interventions were effective in containing the spread of SARS-CoV-2, and they rule out the possibility of cryptic transmission of viral variants from the first wave in January and February 2020. Our study provides evidence and emphasizes the importance of controls for overseas travelers in the context of the pandemic and exemplifies how viral genomic data can facilitate COVID-19 surveillance and inform public health mitigation strategies.

Subjects

COVID-19 , SARS-CoV-2 , Africa , COVID-19 Testing , China/epidemiology , Genomics , Humans

11.

Efficient assembly of nanopore reads via highly accurate and intact error correction.

Chen, Ying; Nie, Fan; Xie, Shang-Qian; Zheng, Ying-Feng; Dai, Qi; Bray, Thomas; Wang, Yao-Xin; Xing, Jian-Feng; Huang, Zhi-Jian; Wang, De-Peng; He, Li-Juan; Luo, Feng; Wang, Jian-Xin; Liu, Yi-Zhi; Xiao, Chuan-Le.

Nat Commun ; 12(1): 60, 2021 01 04.

Article in English | MEDLINE | ID: mdl-33397900

ABSTRACT

Long nanopore reads are advantageous in de novo genome assembly. However, nanopore reads usually have broad error distribution and high-error-rate subsequences. Existing error correction tools cannot correct nanopore reads efficiently and effectively. Most methods trim high-error-rate subsequences during error correction, which reduces both the length of the reads and contiguity of the final assembly. Here, we develop an error correction and de novo assembly tool designed to overcome complex errors in nanopore reads. We propose an adaptive read selection and two-step progressive method to quickly correct nanopore reads to high accuracy. We introduce a two-stage assembler to utilize the full length of nanopore reads. Our tool achieves superior performance in both error correction and de novo assembling nanopore reads. It requires only 8122 hours to assemble a 35X coverage human genome and achieves a 2.47-fold improvement in NG50. Furthermore, our assembly of the human WERI cell line shows an NG50 of 22 Mbp. The high-quality assembly of nanopore reads can significantly reduce false positives in structure variation detection.

Subjects

Nanopores , Sequence Analysis, DNA , Cell Line , Chromosomes, Human/genetics , Genome, Human , Humans , Retinoblastoma/genetics , Software

12.

DNA N6-Methyladenosine modification role in transmitted variations from genomic DNA to RNA in Herrania umbratica.

Luan, Mei-Wei; Chen, Wei; Xing, Jian-Feng; Xiao, Chuan-Le; Chen, Ying; Xie, Shang-Qian.

BMC Genomics ; 20(1): 508, 2019 Jun 18.

Article in English | MEDLINE | ID: mdl-31215402

ABSTRACT

BACKGROUND: DNA methylation is an important epigenetic modification. The recently developed single-molecule real-time (SMRT) sequencing technology provides an efficient way to detect DNA N6-methyladenine (6mA) modification, which plays an important role in epigenetics and positively regulates gene expression. In addition, gene expression is also regulated by genetic variation. However, the relationship between DNA 6mA modification and variation is still unknown. RESULTS: We collected SMRT long-read DNA, Illumina short-read DNA and RNA datasets from the young leaves of Herrania umbratica, and used them to detect 35,654 6mA modification sites, 829,894 DNA variations and 60,672 RNA variations, respectively, among which there are 303 DNA variations and 19 RNA variations with 6mA modification, and 57,468 genetic variations transmitted from DNA to RNA. The results illustrated that genes with 6mA modification were significantly less likely to mutate than genes without modification (p-value < 4.9e-08). The results from the linear regression model showed that the 6mA densities of genes were associated with the transmitted variation type 0/1 to 1/1 (p-value < 0.001). CONCLUSIONS: The variations of DNA and RNA in genes with 6mA modification were significantly fewer than those in unmodified genes. Furthermore, the variations in 6mA-modified genes were easily transmitted from DNA to RNA, especially the transmitted variation from DNA heterozygote to RNA homozygote.

Subjects

Adenosine/analogs & derivatives , DNA, Plant/genetics , DNA, Plant/metabolism , Genetic Variation/genetics , Genome, Plant/genetics , Magnoliopsida/genetics , RNA, Plant/genetics , Adenosine/metabolism , DNA, Intergenic/genetics , DNA, Intergenic/metabolism , DNA, Plant/chemistry , Heterozygote , Homozygote , Magnoliopsida/metabolism

13.

MDR: an integrative DNA N6-methyladenine and N4-methylcytosine modification database for Rosaceae.

Liu, Zhao-Yu; Xing, Jian-Feng; Chen, Wei; Luan, Mei-Wei; Xie, Rui; Huang, Jing; Xie, Shang-Qian; Xiao, Chuan-Le.

Hortic Res ; 6: 78, 2019.

Article in English | MEDLINE | ID: mdl-31240103

ABSTRACT

Eukaryotic DNA methylation has been receiving increasing attention for its crucial epigenetic regulatory function. The recently developed single-molecule real-time (SMRT) sequencing technology provides an efficient way to detect DNA N6-methyladenine (6mA) and N4-methylcytosine (4mC) modifications at a single-nucleotide resolution. The family Rosaceae contains horticultural plants with a wide range of economic importance. However, little is currently known regarding the genome-wide distribution patterns and functions of 6mA and 4mC modifications in the Rosaceae. In this study, we present an integrated DNA 6mA and 4mC modification database for the Rosaceae (MDR, http://mdr.xieslab.org). MDR, the first repository for displaying and storing DNA 6mA and 4mC methylomes from SMRT sequencing data sets for Rosaceae, includes meta and statistical information, methylation densities, Gene Ontology enrichment analyses, and genome search and browse for methylated sites in NCBI. MDR provides important information regarding DNA 6mA and 4mC methylation and may help users better understand epigenetic modifications in the family Rosaceae.

14.

Detection of DNA base modifications by deep recurrent neural network on Oxford Nanopore sequencing data.

Liu, Qian; Fang, Li; Yu, Guoliang; Wang, Depeng; Xiao, Chuan-Le; Wang, Kai.

Nat Commun ; 10(1): 2449, 2019 06 04.

Article in English | MEDLINE | ID: mdl-31164644

ABSTRACT

DNA base modifications, such as C5-methylcytosine (5mC) and N6-methyldeoxyadenosine (6mA), are important types of epigenetic regulations. Short-read bisulfite sequencing and long-read PacBio sequencing have inherent limitations to detect DNA modifications. Here, using raw electric signals of Oxford Nanopore long-read sequencing data, we design DeepMod, a bidirectional recurrent neural network (RNN) with long short-term memory (LSTM) to detect DNA modifications. We sequence a human genome HX1 and a Chlamydomonas reinhardtii genome using Nanopore sequencing, and then evaluate DeepMod on three types of genomes (Escherichia coli, Chlamydomonas reinhardtii and human genomes). For 5mC detection, DeepMod achieves average precision up to 0.99 for both synthetically introduced and naturally occurring modifications. For 6mA detection, DeepMod achieves ~0.9 average precision on Escherichia coli data, and performs better than existing methods on Chlamydomonas reinhardtii data. In conclusion, DeepMod performs well for genome-scale detection of DNA modifications and will facilitate epigenetic analysis on diverse species.

Subjects

Chlamydomonas reinhardtii/genetics , DNA Methylation , Escherichia coli/genetics , Genome, Bacterial/genetics , Genome, Human/genetics , Genome, Plant/genetics , Neural Networks, Computer , Databases, Nucleic Acid , Epigenesis, Genetic , Humans , Nanopores

15.

DeepSignal: detecting DNA methylation state from Nanopore sequencing reads using deep-learning.

Ni, Peng; Huang, Neng; Zhang, Zhi; Wang, De-Peng; Liang, Fan; Miao, Yu; Xiao, Chuan-Le; Luo, Feng; Wang, Jianxin.

Bioinformatics ; 35(22): 4586-4595, 2019 11 01.

Article in English | MEDLINE | ID: mdl-30994904

ABSTRACT

MOTIVATION: Oxford Nanopore sequencing enables direct detection of the methylation states of bases in DNA from reads without extra laboratory techniques. Novel computational methods are required to improve the accuracy and robustness of DNA methylation state prediction using Nanopore reads. RESULTS: In this study, we develop DeepSignal, a deep learning method to detect DNA methylation states from Nanopore sequencing reads. Testing on Nanopore reads of Homo sapiens (H. sapiens), Escherichia coli (E. coli) and pUC19 shows that DeepSignal can achieve higher performance at both read level and genome level on detecting 6mA and 5mC methylation states compared to previous hidden Markov model (HMM) based methods. DeepSignal achieves similar performance across different DNA methylation bases, different DNA methylation motifs and both singleton and mixed DNA CpG. Moreover, DeepSignal requires much lower coverage than that required by HMM and statistics based methods. DeepSignal can achieve above 90% accuracy for detecting 5mC and 6mA using only 2× coverage of reads. Furthermore, for DNA CpG methylation state prediction, DeepSignal achieves 90% correlation with bisulfite sequencing using just 20× coverage of reads, which is much better than HMM based methods. Notably, DeepSignal can predict methylation states of 5% more DNA CpGs that previously could not be predicted by bisulfite sequencing. DeepSignal can be a robust and accurate method for detecting methylation states of DNA bases. AVAILABILITY AND IMPLEMENTATION: DeepSignal is publicly available at https://github.com/bioinfomaticsCSU/deepsignal. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

DNA Methylation , High-Throughput Nucleotide Sequencing , Deep Learning , Escherichia coli , Humans , Nanopore Sequencing , Sequence Analysis, DNA

16.

N⁶-Methyladenine DNA Modification in the Woodland Strawberry (Fragaria vesca) Genome Reveals a Positive Relationship With Gene Transcription.

Xie, Shang-Qian; Xing, Jian-Feng; Zhang, Xiao-Ming; Liu, Zhao-Yu; Luan, Mei-Wei; Zhu, Jie; Ling, Peng; Xiao, Chuan-Le; Song, Xi-Qiang; Zheng, Jun; Chen, Ying.

Front Genet ; 10: 1288, 2019.

Article in English | MEDLINE | ID: mdl-31998359

ABSTRACT

N6-methyladenine (6mA) DNA modification has been detected in several eukaryotic organisms, where it plays important roles in gene regulation and epigenetic memory maintenance. However, the genome-wide distribution patterns and potential functions of 6mA DNA modification in woodland strawberry (Fragaria vesca) remain largely unknown. Here, we examined the 6mA landscape in the F. vesca genome by adopting single-molecule real-time sequencing technology and found that 6mA modification sites were broadly distributed across the woodland strawberry genome. The pattern of 6mA distribution in the long non-coding RNA was significantly different from that in protein-coding genes. The 6mA modification influenced the gene transcription and was positively associated with gene expression, which was validated by computational and experimental analyses. Our study provides new insights into the DNA methylation in F. vesca.

17.

ISOdb: A Comprehensive Database of Full-Length Isoforms Generated by Iso-Seq.

Xie, Shang-Qian; Han, Yue; Chen, Xiao-Zhou; Cao, Tai-Yu; Ji, Kai-Kai; Zhu, Jie; Ling, Peng; Xiao, Chuan-Le.

Int J Genomics ; 2018: 9207637, 2018.

Article in English | MEDLINE | ID: mdl-30581839

ABSTRACT

The accurate landscape of transcript isoforms plays an important role in the understanding of gene function and gene regulation. However, building complete transcripts is very challenging for short reads generated using next-generation sequencing. Fortunately, isoform sequencing (Iso-Seq) using single-molecule sequencing technologies, such as PacBio SMRT, provides long reads spanning entire transcript isoforms which do not require assembly. Therefore, we have developed ISOdb, a comprehensive resource database for hosting and carrying out an in-depth analysis of Iso-Seq datasets and visualising the full-length transcript isoforms. The current version of ISOdb has collected 93 publicly available Iso-Seq samples from eight species and presents the samples in two levels: (1) sample level, including metainformation, long read distribution, isoform numbers, and alternative splicing (AS) events of each sample; (2) gene level, including the total isoforms, novel isoform number, novel AS number, and isoform visualisation of each gene. In addition, ISOdb provides a user interface in the website for uploading sample information to facilitate the collection and analysis of researchers' datasets. Currently, ISOdb is the first repository that offers comprehensive resources and convenient public access for hosting, analysing, and visualising Iso-Seq data, which is freely available.

18.

N6-Methyladenine DNA modification in Xanthomonas oryzae pv. oryzicola genome.

Xiao, Chuan-Le; Xie, Shang-Qian; Xie, Qing-Biao; Liu, Zhao-Yu; Xing, Jian-Feng; Ji, Kai-Kai; Tao, Jun; Dai, Liang-Ying; Luo, Feng.

Sci Rep ; 8(1): 16272, 2018 11 02.

Article in English | MEDLINE | ID: mdl-30389999

ABSTRACT

DNA N6-methyladenine (6mA) modifications expand the information capacity of DNA and have long been known to exist in bacterial genomes. Xanthomonas oryzae pv. oryzicola (Xoc) is the causative agent of bacterial leaf streak, an emerging and destructive disease in rice worldwide. However, the genome-wide distribution patterns and potential functions of 6mA in Xoc are largely unknown. In this study, we analyzed the levels and global distribution patterns of 6mA modification in genomic DNA of seven Xoc strains (BLS256, BLS279, CFBP2286, CFBP7331, CFBP7341, L8 and RS105). The 6mA modification was found to be widely distributed across the seven Xoc genomes, accounting for 3.80%, 3.10%, 3.70%, 4.20%, 3.40%, 2.10%, and 3.10% of the total adenines in BLS256, BLS279, CFBP2286, CFBP7331, CFBP7341, L8, and RS105, respectively. Notably, more than 82% of 6mA sites were located within gene bodies in all seven strains. Two specific motifs for 6mA modification, ARGT and AVCG, were prevalent in all seven strains. Comparison of putative DNA methylation motifs from the seven strains reveals that Xoc has a specific DNA methylation system. Furthermore, the 6mA modification of rpfC decreased dramatically during Xoc infection, which indicates an important role for 6mA in Xoc adaptation to the environment.

Subjects

Adenine/analogs & derivatives , DNA Methylation/genetics , DNA, Bacterial/metabolism , Gene Expression Regulation, Bacterial , Xanthomonas/genetics , Adenine/metabolism , Bacterial Proteins/genetics , Genes, Bacterial/genetics , Host-Pathogen Interactions/genetics , Oryza/microbiology , Plant Diseases/microbiology , Plant Leaves/microbiology , Virulence/genetics , Xanthomonas/pathogenicity

19.

N⁶-Methyladenine DNA Modification in the Human Genome.

Xiao, Chuan-Le; Zhu, Song; He, Minghui; Chen, De; Zhang, Qian; Chen, Ying; Yu, Guoliang; Liu, Jinbao; Xie, Shang-Qian; Luo, Feng; Liang, Zhe; Wang, De-Peng; Bo, Xiao-Chen; Gu, Xiao-Feng; Wang, Kai; Yan, Guang-Rong.

Mol Cell ; 71(2): 306-318.e7, 2018 07 19.

Article in English | MEDLINE | ID: mdl-30017583

ABSTRACT

DNA N6-methyladenine (6mA) modification is the most prevalent DNA modification in prokaryotes, but whether it exists in human cells and whether it plays a role in human diseases remain enigmatic. Here, we showed that 6mA is extensively present in the human genome, and we cataloged 881,240 6mA sites accounting for ~0.051% of the total adenines. [G/C]AGG[C/T] was the most significantly associated motif with 6mA modification. 6mA sites were enriched in the coding regions and mark actively transcribed genes in human cells. DNA 6mA and N6-demethyladenine modification in the human genome were mediated by methyltransferase N6AMT1 and demethylase ALKBH1, respectively. The abundance of 6mA was significantly lower in cancers, accompanied by decreased N6AMT1 and increased ALKBH1 levels, and downregulation of 6mA modification levels promoted tumorigenesis. Collectively, our results demonstrate that DNA 6mA modification is extensively present in human cells and the decrease of genomic DNA 6mA promotes human tumorigenesis.

Subjects

Adenine/analogs & derivatives , AlkB Homolog 1, Histone H2a Dioxygenase/metabolism , Genome, Human , Site-Specific DNA-Methyltransferase (Adenine-Specific)/metabolism , Adenine/metabolism , AlkB Homolog 1, Histone H2a Dioxygenase/genetics , Animals , Carcinogenesis/genetics , DNA/genetics , DNA Methylation , Heterografts , Humans , Mice , Mice, Nude , Site-Specific DNA-Methyltransferase (Adenine-Specific)/genetics

20.

MECAT: fast mapping, error correction, and de novo assembly for single-molecule sequencing reads.

Xiao, Chuan-Le; Chen, Ying; Xie, Shang-Qian; Chen, Kai-Ning; Wang, Yan; Han, Yue; Luo, Feng; Xie, Zhi.

Nat Methods ; 14(11): 1072-1074, 2017 Nov.

Article in English | MEDLINE | ID: mdl-28945707

ABSTRACT

We present a tool that combines fast mapping, error correction, and de novo assembly (MECAT; accessible at https://github.com/xiaochuanle/MECAT) for processing single-molecule sequencing (SMS) reads. MECAT's computing efficiency is superior to that of current tools, while the results MECAT produces are comparable or improved. MECAT enables reference mapping or de novo assembly of large genomes using SMS reads on a single computer.

Subjects

High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Algorithms , Software
