Results 1 - 20 of 57
1.
PLoS Biol ; 22(2): e3002505, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38363809

ABSTRACT

Alternative splicing is an essential regulatory mechanism for development and pathogenesis. Through alternative splicing, one gene can encode multiple isoforms and be translated into proteins with different functions. This diversity is therefore an important dimension for understanding the molecular mechanisms governing embryo development. Isoform expression in preimplantation embryos has been extensively investigated, leading to the discovery of new isoforms. However, the dynamics of isoform switching of different types of transcripts throughout development remain unexplored. Here, using single-cell direct isoform sequencing in over 100 single blastomeres from the mouse oocyte to blastocyst stage, we quantified isoform expression and found that 3-prime partial transcripts lacking stop codons are highly accumulated in oocytes and zygotes. These transcripts are not transcription by-products and might play a role in the maternal-to-zygotic transition (MZT). Long-read sequencing also enabled us to determine the expression of transposable elements (TEs) at specific loci. In this way, we identified 3,894 TE loci that exhibited dynamic changes during preimplantation development, likely regulating the expression of adjacent genes. Our work provides novel insights into the transcriptional regulation of early embryo development.


Subject(s)
DNA Transposable Elements , Embryonic Development , Female , Pregnancy , Animals , Mice , DNA Transposable Elements/genetics , Embryonic Development/genetics , Protein Isoforms/genetics , Zygote , Single-Cell Analysis
2.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39177264

ABSTRACT

The recent nanopore sequencing system (R10.4) has enhanced base-calling accuracy and is increasingly used for detecting CpG methylation states. However, the robustness and universality of the methylation calling model in the officially supplied Dorado remain poorly tested. In this study, we obtained heterogeneous datasets from human and plant sources to carry out comprehensive evaluations, which showed that Dorado performed significantly differently across datasets. We therefore developed deep neural networks and implemented several optimizations in training a new model called DeepBAM. DeepBAM achieved superior and more stable performance compared with Dorado, including higher areas under the ROC curve (98.47% on average and up to 7.36% improvement) and F1 scores (94.97% on average and up to 16.24% improvement) across the datasets. DeepBAM-based whole-genome methylation frequencies achieved >0.95 correlations with BS-seq on four of five datasets, outperforming Dorado in all instances. DeepBAM enables unraveling allele-specific methylation patterns, including regions of transposable elements. The enhanced performance of DeepBAM paves the way for broader applications of nanopore sequencing in CpG methylation studies.
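
As a point of reference for the F1 scores quoted in this abstract, the following minimal Python sketch computes F1 from read-level confusion counts. The function name and the example counts are hypothetical; nothing here is taken from the DeepBAM or Dorado code.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall.

    tp, fp, fn are counts of true positives, false positives and false
    negatives (for example, methylated calls checked against a truth set).
    Returns 0.0 when there are no true positives, where F1 is 0 or undefined.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
print(f1_score(tp=8, fp=2, fn=2))
```

Because F1 ignores true negatives, it is usually reported alongside a threshold-free measure such as the area under the ROC curve, as the abstract does.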


Subject(s)
CpG Islands , DNA Methylation , Nanopore Sequencing , Nanopore Sequencing/methods , Humans , Software , Sequence Analysis, DNA/methods , Neural Networks, Computer
3.
Mol Cell ; 71(2): 306-318.e7, 2018 07 19.
Article in English | MEDLINE | ID: mdl-30017583

ABSTRACT

DNA N6-methyladenine (6mA) modification is the most prevalent DNA modification in prokaryotes, but whether it exists in human cells and whether it plays a role in human diseases remain enigmatic. Here, we showed that 6mA is extensively present in the human genome, and we cataloged 881,240 6mA sites accounting for ∼0.051% of the total adenines. [G/C]AGG[C/T] was the most significantly associated motif with 6mA modification. 6mA sites were enriched in the coding regions and mark actively transcribed genes in human cells. DNA 6mA and N6-demethyladenine modification in the human genome were mediated by methyltransferase N6AMT1 and demethylase ALKBH1, respectively. The abundance of 6mA was significantly lower in cancers, accompanied by decreased N6AMT1 and increased ALKBH1 levels, and downregulation of 6mA modification levels promoted tumorigenesis. Collectively, our results demonstrate that DNA 6mA modification is extensively present in human cells and the decrease of genomic DNA 6mA promotes human tumorigenesis.
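
The degenerate motif [G/C]AGG[C/T] named in this abstract can be written as a regular expression. The sketch below is only an illustration of scanning a sequence for that motif; the function name and the test sequence are hypothetical, and it searches the given strand only, which is an assumption rather than the study's procedure.

```python
import re

# [G/C]AGG[C/T] as a character-class pattern. The lookahead makes the match
# zero-width, so overlapping occurrences are all reported by finditer.
MOTIF = re.compile(r"(?=[GC]AGG[CT])")


def motif_starts(seq: str) -> list[int]:
    """Return 0-based start positions of the motif on the given strand."""
    return [m.start() for m in MOTIF.finditer(seq.upper())]


# Hypothetical sequence: GAGGC at position 0 overlaps CAGGC at position 4.
print(motif_starts("GAGGCAGGC"))  # → [0, 4]
```

The motif contains a single adenine, at offset 1 from each reported start position.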


Subject(s)
Adenine/analogs & derivatives , AlkB Homolog 1, Histone H2a Dioxygenase/metabolism , Genome, Human , Site-Specific DNA-Methyltransferase (Adenine-Specific)/metabolism , Adenine/metabolism , AlkB Homolog 1, Histone H2a Dioxygenase/genetics , Animals , Carcinogenesis/genetics , DNA/genetics , DNA Methylation , Heterografts , Humans , Mice , Mice, Nude , Site-Specific DNA-Methyltransferase (Adenine-Specific)/genetics
4.
BMC Genomics ; 25(1): 336, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38570743

ABSTRACT

The Asian tiger mosquito, Aedes albopictus, is a global invasive species, notorious for its role in transmitting dangerous human arboviruses such as dengue and Chikungunya. Although hematophagous behavior is repulsive, it is an effective strategy for mosquitoes like Aedes albopictus to transmit viruses, posing a significant risk to human health. However, the fragmented nature of the Ae. albopictus genome assembly has been a significant challenge, hindering in-depth biological and genetic studies of this mosquito. In this research, we have harnessed a variety of technologies and implemented a novel strategy to create a significantly improved genome assembly for Ae. albopictus, designated as AealbF3. This assembly boasts a completeness rate of up to 98.1%, and the duplication rate has been minimized to 1.2%. Furthermore, the fragmented contigs or scaffolds of AealbF3 have been organized into three distinct chromosomes, an arrangement corroborated through syntenic plot analysis, which compared the genetic structure of Ae. albopictus with that of Ae. aegypti. Additionally, the study has revealed a phylogenetic relationship suggesting that the PGANT3 gene is implicated in the hematophagous behavior of Ae. albopictus. This involvement was preliminarily substantiated through RNA interference (RNAi) techniques and behavioral experiment. In summary, the AealbF3 genome assembly will facilitate new biological insights and intervention strategies for combating this formidable vector of disease. The innovative assembly process employed in this study could also serve as a valuable template for the assembly of genomes in other insects characterized by high levels of heterozygosity.


Subject(s)
Aedes , Mosquito Vectors , Animals , Humans , Mosquito Vectors/genetics , Phylogeny , Feeding Behavior
5.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36548365

ABSTRACT

MOTIVATION: Oxford Nanopore sequencing has great potential and advantages in population-scale studies. Due to the cost of sequencing, the depth of whole-genome sequencing per individual sample must be small. However, existing single nucleotide polymorphism (SNP) callers are aimed at high-coverage Nanopore sequencing reads. Detecting SNP variants in low-coverage Nanopore sequencing data is still a challenging problem. RESULTS: We developed a novel deep learning-based SNP calling method, NanoSNP, to identify SNP sites (excluding short indels) from low-coverage Nanopore sequencing reads. In this method, we design a multi-step, multi-scale and haplotype-aware SNP detection pipeline. First, the pileup model in NanoSNP uses the naive pileup feature to predict a subset of SNP sites with a bidirectional long short-term memory (Bi-LSTM) network. These SNP sites are then phased and used to divide the low-coverage Nanopore reads into different haplotypes. Finally, the long-range haplotype feature and short-range pileup feature are extracted from each haplotype. The haplotype model combines the two features and predicts the genotype for the candidate site using a Bi-LSTM network. To evaluate the performance of NanoSNP, we compared NanoSNP with Clair, Clair3, Pepper-DeepVariant and NanoCaller on low-coverage (∼16×) Nanopore sequencing reads. We also performed cross-genome testing on six human genomes, HG002-HG007. Comprehensive experiments demonstrate that NanoSNP outperforms Clair, Pepper-DeepVariant and NanoCaller in identifying SNPs on low-coverage Nanopore sequencing data, including the difficult-to-map regions and major histocompatibility complex regions in the human genome. NanoSNP is comparable to Clair3 when the coverage exceeds 16×. AVAILABILITY AND IMPLEMENTATION: https://github.com/huangnengCSU/NanoSNP.git. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
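
To make the idea of a "naive pileup feature" concrete, here is a minimal Python sketch that counts bases per reference position and flags positions with a high non-reference fraction. It is not NanoSNP's feature encoding or model: the function names, the gapless (start, sequence) read representation and the 0.2 threshold are all assumptions for illustration.

```python
from collections import Counter


def pileup_counts(reads, ref_len):
    """Count observed bases at each reference position.

    reads: iterable of (start, sequence) pairs with a 0-based start and a
    gapless alignment (an assumption; real alignments carry indels and clips).
    """
    columns = [Counter() for _ in range(ref_len)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < ref_len:
                columns[pos][base] += 1
    return columns


def candidate_sites(columns, ref, min_alt_fraction=0.2):
    """Positions where non-reference bases reach min_alt_fraction of the depth."""
    sites = []
    for pos, counts in enumerate(columns):
        depth = sum(counts.values())
        if depth == 0:
            continue  # no coverage, nothing to call
        alt = depth - counts[ref[pos]]
        if alt / depth >= min_alt_fraction:
            sites.append(pos)
    return sites


# Hypothetical toy data: three reads over a 4-base reference.
ref = "ACGT"
reads = [(0, "ACGT"), (0, "ATGT"), (1, "TGT")]
print(candidate_sites(pileup_counts(reads, len(ref)), ref))
```

In a pipeline like the one the abstract describes, per-position counts of this kind would feed a learned model rather than a fixed threshold.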


Subject(s)
Nanopore Sequencing , Nanopores , Humans , Haplotypes , Software , Polymorphism, Single Nucleotide , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods
6.
J Clin Microbiol ; 59(8): e0007921, 2021 07 19.
Article in English | MEDLINE | ID: mdl-33952598

ABSTRACT

While China experienced a peak and decline in coronavirus disease 2019 (COVID-19) cases at the start of 2020, regional outbreaks continuously emerged in subsequent months. Resurgences of COVID-19 have also been observed in many other countries. In Guangzhou, China, a small outbreak, involving less than 100 residents, emerged in March and April 2020, and comprehensive and near-real-time genomic surveillance of SARS-CoV-2 was conducted. When the numbers of confirmed cases among overseas travelers increased, public health measures were enhanced by shifting from self-quarantine to central quarantine and SARS-CoV-2 testing for all overseas travelers. In an analysis of 109 imported cases, we found diverse viral variants distributed in the global viral phylogeny, which were frequently shared within households but not among passengers on the same flight. In contrast to the viral diversity of imported cases, local transmission was predominately attributed to two specific variants imported from Africa, including local cases that reported no direct or indirect contact with imported cases. The introduction events of the virus were identified or deduced before the enhanced measures were taken. These results show the interventions were effective in containing the spread of SARS-CoV-2, and they rule out the possibility of cryptic transmission of viral variants from the first wave in January and February 2020. Our study provides evidence and emphasizes the importance of controls for overseas travelers in the context of the pandemic and exemplifies how viral genomic data can facilitate COVID-19 surveillance and inform public health mitigation strategies.


Subject(s)
COVID-19 , SARS-CoV-2 , Africa , COVID-19 Testing , China/epidemiology , Genomics , Humans
7.
Nat Methods ; 14(11): 1072-1074, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28945707

ABSTRACT

We present a tool that combines fast mapping, error correction, and de novo assembly (MECAT; accessible at https://github.com/xiaochuanle/MECAT) for processing single-molecule sequencing (SMS) reads. MECAT's computing efficiency is superior to that of current tools, while the results MECAT produces are comparable or improved. MECAT enables reference mapping or de novo assembly of large genomes using SMS reads on a single computer.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Algorithms , Software
8.
Bioinformatics ; 35(22): 4586-4595, 2019 11 01.
Article in English | MEDLINE | ID: mdl-30994904

ABSTRACT

MOTIVATION: Oxford Nanopore sequencing enables direct detection of the methylation states of DNA bases from reads without extra laboratory techniques. Novel computational methods are required to improve the accuracy and robustness of DNA methylation state prediction using Nanopore reads. RESULTS: In this study, we develop DeepSignal, a deep learning method to detect DNA methylation states from Nanopore sequencing reads. Testing on Nanopore reads of Homo sapiens (H. sapiens), Escherichia coli (E. coli) and pUC19 shows that DeepSignal achieves higher performance at both the read level and the genome level in detecting 6mA and 5mC methylation states compared to previous hidden Markov model (HMM) based methods. DeepSignal achieves similar performance across different DNA methylation bases, different DNA methylation motifs and both singleton and mixed DNA CpG. Moreover, DeepSignal requires much lower coverage than HMM and statistics based methods. DeepSignal can achieve above 90% accuracy for detecting 5mC and 6mA using only 2× coverage of reads. Furthermore, for DNA CpG methylation state prediction, DeepSignal achieves 90% correlation with bisulfite sequencing using just 20× coverage of reads, which is much better than HMM based methods. In particular, DeepSignal can predict methylation states of 5% more DNA CpGs that previously could not be predicted by bisulfite sequencing. DeepSignal can be a robust and accurate method for detecting methylation states of DNA bases. AVAILABILITY AND IMPLEMENTATION: DeepSignal is publicly available at https://github.com/bioinfomaticsCSU/deepsignal. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
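
The genome-level comparison with bisulfite sequencing described here rests on two simple quantities: a per-site methylation frequency aggregated from read-level calls, and a correlation between two sets of site frequencies. The Python sketch below illustrates both under stated assumptions: the function names, the (site, is_methylated) input format and the example values are hypothetical and are not DeepSignal's interface or data.

```python
from collections import defaultdict
from math import sqrt


def site_frequencies(read_calls):
    """Aggregate read-level calls into per-site methylation frequencies.

    read_calls: iterable of (site, is_methylated) pairs, one per read per site.
    Returns {site: methylated reads / covering reads}.
    """
    totals = defaultdict(lambda: [0, 0])  # site -> [methylated, covered]
    for site, is_methylated in read_calls:
        totals[site][0] += int(is_methylated)
        totals[site][1] += 1
    return {site: m / n for site, (m, n) in totals.items()}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical read-level calls and bisulfite frequencies for three sites.
calls = [("s1", True), ("s1", False), ("s2", True), ("s3", False)]
nanopore = site_frequencies(calls)
bisulfite = {"s1": 0.6, "s2": 0.9, "s3": 0.1}
shared = sorted(nanopore.keys() & bisulfite.keys())
print(pearson([nanopore[s] for s in shared], [bisulfite[s] for s in shared]))
```

Only sites covered by both methods enter the correlation, which is why coverage requirements matter for this kind of comparison.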


Subject(s)
DNA Methylation , High-Throughput Nucleotide Sequencing , Deep Learning , Escherichia coli , Humans , Nanopore Sequencing , Sequence Analysis, DNA
9.
BMC Genomics ; 20(1): 508, 2019 Jun 18.
Article in English | MEDLINE | ID: mdl-31215402

ABSTRACT

BACKGROUND: DNA methylation is an important epigenetic modification. The recently developed single-molecule real-time (SMRT) sequencing technology provides an efficient way to detect DNA N6-methyladenine (6mA) modification, which plays an important role in epigenetics and positively regulates gene expression. Gene expression is also regulated by genetic variation. However, the relationship between DNA 6mA modification and variation is still unknown. RESULTS: We collected SMRT long-read DNA, Illumina short-read DNA and RNA datasets from the young leaves of Herrania umbratica and used them to detect 35,654 6mA modification sites, 829,894 DNA variations and 60,672 RNA variations, respectively. Among these, there were 303 DNA variations and 19 RNA variations with 6mA modification, and 57,468 genetic variations transmitted from DNA to RNA. The results showed that genes with 6mA modification were significantly less likely to mutate than genes without modification (p-value < 4.9e-08). A linear regression model showed that the 6mA densities of genes were associated with transmitted variations of type 0/1 to 1/1 (p-value < 0.001). CONCLUSIONS: Variations of DNA and RNA in genes with 6mA modification were significantly fewer than those in unmodified genes. Furthermore, variations in 6mA-modified genes were easily transmitted from DNA to RNA, especially variations transmitted from a DNA heterozygote to an RNA homozygote.


Subject(s)
Adenosine/analogs & derivatives , DNA, Plant/genetics , DNA, Plant/metabolism , Genetic Variation/genetics , Genome, Plant/genetics , Magnoliopsida/genetics , RNA, Plant/genetics , Adenosine/metabolism , DNA, Intergenic/genetics , DNA, Intergenic/metabolism , DNA, Plant/chemistry , Heterozygote , Homozygote , Magnoliopsida/metabolism
10.
Nucleic Acids Res ; 45(D1): D85-D89, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27924023

ABSTRACT

DNA methylation is an important type of epigenetic modification, of which 5-methylcytosine (5mC), 6-methyladenine (6mA) and 4-methylcytosine (4mC) are the most common types. Previous efforts have largely focused on 5mC, providing invaluable insights into epigenetic regulation through DNA methylation. The recently developed single-molecule real-time (SMRT) sequencing technology provides a unique opportunity to detect the less studied DNA 6mA and 4mC modifications at single-nucleotide resolution. With the rapidly increasing amount of SMRT sequencing data being generated, there is an emerging demand to systematically explore DNA 6mA and 4mC modifications from these data sets. MethSMRT is the first resource hosting DNA 6mA and 4mC methylomes. All the data sets were processed using the same analysis pipeline with the same quality control. The current version of the database provides a platform to store, browse, search and download epigenome-wide methylation profiles of 156 species, including seven eukaryotes such as Arabidopsis, C. elegans, Drosophila, mouse and yeast, as well as 149 prokaryotes. It also offers a genome browser to visualize the methylation sites and related information such as single nucleotide polymorphisms (SNPs) and genomic annotation. Furthermore, the database provides a quick statistical summary of the 6mA and 4mC methylomes and predicted methylation motifs for each species. MethSMRT is publicly available at http://sysbio.sysu.edu.cn/methsmrt/ without use restriction.


Subject(s)
Adenine/analogs & derivatives , Cytosine/analogs & derivatives , DNA Methylation , Databases, Nucleic Acid , Adenine/analysis , Animals , Cytosine/analysis , DNA/chemistry , Genome , Polymorphism, Single Nucleotide , Sequence Analysis, DNA
11.
PLoS Genet ; 11(6): e1005302, 2015 Jun.
Article in English | MEDLINE | ID: mdl-26090660

ABSTRACT

Translational systems can respond promptly to sudden environmental changes to provide rapid adaptations to environmental stress. Unlike the well-studied translational responses to oxidative stress in eukaryotic systems, little is known regarding how prokaryotes respond rapidly to oxidative stress in terms of translation. In this study, we measured protein synthesis from the entire Escherichia coli proteome and found that protein synthesis was severely slowed down under oxidative stress. With unchanged translation initiation, this slowdown was caused by decreased translation elongation speed. We further confirmed by tRNA sequencing and qRT-PCR that this deceleration was caused by a global, enzymatic downregulation of almost all tRNA species shortly after exposure to oxidative agents. Elevation in tRNA levels accelerated translation and protected E. coli against oxidative stress caused by hydrogen peroxide and the antibiotic ciprofloxacin. Our results showed that the global regulation of tRNAs mediates the rapid adjustment of the E. coli translation system for prompt adaptation to oxidative stress.


Subject(s)
Adaptation, Physiological , Escherichia coli/metabolism , Oxidative Stress , RNA, Transfer/metabolism , Down-Regulation , Escherichia coli/genetics , RNA, Transfer/genetics
12.
Mol Cell Proteomics ; 13(2): 503-19, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24200585

ABSTRACT

Tetrahymena thermophila is a widely used unicellular eukaryotic model organism in biological research and contains more than 1000 protein kinases and phosphatases with specificity for Ser/Thr/Tyr residues. However, only a few dozen phosphorylation sites in T. thermophila are known, presenting a major obstacle to further understanding of the regulatory roles of reversible phosphorylation in this organism. In this study, we used high-accuracy mass-spectrometry-based proteomics to conduct global and site-specific phosphoproteome profiling of T. thermophila. In total, 1384 phosphopeptides and 2238 phosphorylation sites from 1008 T. thermophila proteins were identified through the combined use of peptide prefractionation, TiO2 enrichment, and two-dimensional LC-MS/MS analysis. The identified phosphoproteins are implicated in the regulation of various biological processes such as transport, gene expression, and mRNA metabolic process. Moreover, integrated analysis of the T. thermophila phosphoproteome and gene network revealed the potential biological functions of many previously unannotated proteins and predicted some putative kinase-substrate pairs. Our data provide the first global survey of phosphorylation in T. thermophila using a phosphoproteomic approach and suggest a wide-ranging regulatory scope of this modification. The provided dataset is a valuable resource for the future understanding of signaling pathways in this important model organism.


Subject(s)
Phosphoproteins/metabolism , Phosphotransferases/metabolism , Protein Interaction Maps , Protein Processing, Post-Translational , Proteome/metabolism , Tetrahymena thermophila/metabolism , Amino Acid Sequence , Humans , Models, Theoretical , Phosphoproteins/analysis , Phosphorylation , Plasmodium falciparum/metabolism , Proteome/analysis , Proteomics/methods , Signal Transduction , Trypanosoma brucei brucei/metabolism
13.
Nucleic Acids Res ; 40(11): e83, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22379138

ABSTRACT

The most crucial step in data processing from high-throughput sequencing applications is the accurate and sensitive alignment of the sequencing reads to reference genomes or transcriptomes. The accurate detection of insertions and deletions (indels) and of errors introduced by the sequencing platform or by misreading of modified nucleotides is essential for the quantitative processing of RNA-based sequencing (RNA-Seq) datasets and for the identification of genetic variations and modification patterns. We developed a new, fast and accurate algorithm for nucleic acid sequence analysis, FANSe, with adjustable mismatch allowance settings and the ability to handle indels, to accurately and quantitatively map millions of reads to small or large reference genomes. It is a seed-based algorithm that uses the whole read information for mapping; high sensitivity and low ambiguity are achieved by using short and non-overlapping reads. Furthermore, FANSe uses a hotspot score to prioritize the processing of highly probable matches and implements a modified Smith-Waterman refinement with a reduced scoring matrix to accelerate the calculation without compromising its sensitivity. The FANSe algorithm stably processes datasets from various sequencing platforms, masked or unmasked and small or large genomes. It shows a remarkable coverage of low-abundance mRNAs, which is important for quantitative processing of RNA-Seq datasets.
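
For readers unfamiliar with the refinement step mentioned in this abstract, the sketch below shows the textbook Smith-Waterman local alignment score with a linear gap penalty. It is not FANSe's modified version with a reduced scoring matrix, whose details the abstract does not give; the function name and the scoring values are assumptions for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (linear gap penalty).

    Keeps only two rows of the dynamic-programming table, so memory use is
    proportional to len(b). Returns the score only, not the alignment.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Hypothetical read and reference fragment sharing the substring ACGT.
print(smith_waterman_score("TTACGTTT", "GGACGTGG"))  # → 8
```

The quadratic cost of filling this table for every candidate location is the reason seed-based mappers restrict it to a small set of prioritized hits.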


Subject(s)
Algorithms , High-Throughput Nucleotide Sequencing/methods , Chromosome Mapping , Escherichia coli/genetics , Genomics/methods , HeLa Cells , Humans , INDEL Mutation , Sequence Analysis, RNA
14.
Nat Commun ; 15(1): 2964, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38580638

ABSTRACT

The high sequencing error rate has impeded the application of long noisy reads to diploid genome assembly. Most existing assemblers fail to generate high-quality phased assemblies using long noisy reads. Here, we present PECAT, a Phased Error Correction and Assembly Tool, for reconstructing diploid genomes from long noisy reads. We design a haplotype-aware error correction method that can retain heterozygote alleles while correcting sequencing errors. We combine a corrected-read SNP caller and a raw-read SNP caller to further improve the identification of inconsistent overlaps in the string graph. We use a grouping method to assign reads to different haplotype groups. PECAT efficiently assembles diploid genomes using only Nanopore R9, PacBio CLR or Nanopore R10 reads. PECAT generates more contiguous haplotype-specific contigs than other assemblers. In particular, PECAT achieves a nearly haplotype-resolved assembly of B. taurus (Bison×Simmental) using Nanopore R9 reads and a phase block NG50 of 59.4/58.0 Mb for HG002 using Nanopore R10 reads.
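
The NG50 statistic quoted in this abstract has a standard definition: the length L such that blocks of length at least L together cover half of the genome size. The Python sketch below applies that general definition; the function name and the toy lengths are hypothetical, and how PECAT's evaluation delimits phase blocks is not described in the abstract.

```python
def ng50(block_lengths, genome_size):
    """Length L such that blocks of length >= L cover at least half the genome.

    Unlike N50, the threshold is half of genome_size rather than half of the
    summed block lengths. Returns 0 if the blocks cover less than half.
    """
    covered = 0
    for length in sorted(block_lengths, reverse=True):
        covered += length
        if covered * 2 >= genome_size:
            return length
    return 0


# Hypothetical block lengths (arbitrary units) for a genome of size 100.
print(ng50([40, 30, 20, 10], genome_size=100))  # → 30
```

Using the genome size as the denominator keeps the statistic comparable between assemblies of different total length.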


Subject(s)
Diploidy , Nanopores , Alleles , Haplotypes , Heterozygote , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods
15.
Prog Retin Eye Res ; 101: 101263, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38657834

ABSTRACT

Retinal diseases encompass various conditions associated with sight-threatening immune responses and are leading causes of blindness worldwide. These diseases include age-related macular degeneration, diabetic retinopathy, glaucoma and uveitis. Emerging evidence underscores the vital role of the innate immune response in retinal diseases, beyond the previously emphasized T-cell-driven processes of the adaptive immune system. In particular, pyroptosis, a newly discovered programmed cell death process involving inflammasome formation, has been implicated in the loss of membrane integrity and the release of inflammatory cytokines. Several disease-relevant animal models have provided evidence that the formation of inflammasomes and the induction of pyroptosis in innate immune cells contribute to inflammation in various retinal diseases. In this review article, we summarize current knowledge about the innate immune system and pyroptosis in retinal diseases. We also provide insights into translational targeting approaches, including novel drugs countering pyroptosis, to improve the diagnosis and treatment of retinal diseases.


Subject(s)
Immunity, Innate , Inflammasomes , Pyroptosis , Retinal Diseases , Humans , Pyroptosis/physiology , Inflammasomes/physiology , Inflammasomes/metabolism , Retinal Diseases/metabolism , Retinal Diseases/drug therapy , Animals , Immunity, Innate/physiology
16.
Genome Biol ; 25(1): 107, 2024 04 26.
Article in English | MEDLINE | ID: mdl-38671502

ABSTRACT

Long-read sequencing data, particularly those derived from the Oxford Nanopore sequencing platform, tend to exhibit high error rates. Here, we present NextDenovo, an efficient error correction and assembly tool for noisy long reads, which achieves a high level of accuracy in genome assembly. We apply NextDenovo to assemble 35 diverse human genomes from around the world using Nanopore long-read data. These genomes allow us to identify the landscape of segmental duplication and gene copy number variation in modern human populations. The use of NextDenovo should pave the way for population-scale long-read assembly using Nanopore long-read data.


Subject(s)
DNA Copy Number Variations , Genome, Human , Humans , High-Throughput Nucleotide Sequencing/methods , Software , Nanopore Sequencing/methods , Sequence Analysis, DNA/methods , Genomics/methods
17.
J Proteome Res ; 12(1): 328-35, 2013 Jan 04.
Article in English | MEDLINE | ID: mdl-23163785

ABSTRACT

Mass spectrometry has become one of the most important technologies in proteomic analysis. Tandem mass spectrometry (LC-MS/MS) is a major tool for the analysis of peptide mixtures from protein samples. The key step of MS data processing is the identification of peptides from experimental spectra by searching public sequence databases. Although a number of algorithms to identify peptides from MS/MS data have been already proposed, e.g. Sequest, OMSSA, X!Tandem, Mascot, etc., they are mainly based on statistical models considering only peak-matches between experimental and theoretical spectra, but not peak intensity information. Moreover, different algorithms gave different results from the same MS data, implying their probable incompleteness and questionable reproducibility. We developed a novel peptide identification algorithm, ProVerB, based on a binomial probability distribution model of protein tandem mass spectrometry combined with a new scoring function, making full use of peak intensity information and, thus, enhancing the ability of identification. Compared with Mascot, Sequest, and SQID, ProVerB identified significantly more peptides from LC-MS/MS data sets than the current algorithms at 1% False Discovery Rate (FDR) and provided more confident peptide identifications. ProVerB is also compatible with various platforms and experimental data sets, showing its robustness and versatility. The open-source program ProVerB is available at http://bioinformatics.jnu.edu.cn/software/proverb/ .
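
A binomial model of peak matching, as mentioned in this abstract, asks how likely it is to see at least k matched peaks out of n by chance. The sketch below computes that tail probability in Python. It is a generic illustration, not ProVerB's scoring function, which also uses peak intensity information; the function name and the example parameters are assumptions.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p).

    Here n could be the number of theoretical fragment peaks, k the number
    that match an experimental peak, and p the chance of a random match.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical numbers: 12 of 20 theoretical peaks matched, 10% random-match rate.
print(binomial_tail(12, 20, 0.1))
```

A smaller tail probability means the observed number of matches is harder to explain by chance, so it can serve as the basis of a match score.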


Subject(s)
Algorithms , Peptides , Proteins , Tandem Mass Spectrometry , Databases, Protein , Internet , Models, Statistical , Peptides/genetics , Peptides/isolation & purification , Probability , Proteins/genetics , Proteins/isolation & purification , Software
18.
Nat Commun ; 14(1): 1250, 2023 03 06.
Article in English | MEDLINE | ID: mdl-36878904

ABSTRACT

Canonical three-dimensional (3D) genome structures represent the ensemble average of pairwise chromatin interactions but not the single-allele topologies in populations of cells. Recently developed Pore-C can capture multiway chromatin contacts that reflect regional topologies of single chromosomes. By carrying out high-throughput Pore-C, we reveal extensive but regionally restricted clusters of single-allele topologies that aggregate into canonical 3D genome structures in two human cell types. We show that fragments in multi-contact reads generally coexist in the same TAD. In contrast, a concurrent significant proportion of multi-contact reads span multiple compartments of the same chromatin type over megabase distances. Synergistic chromatin looping between multiple sites in multi-contact reads is rare compared to pairwise interactions. Interestingly, the single-allele topology clusters are cell type-specific even inside highly conserved TADs in different types of cells. In summary, HiPore-C enables global characterization of single-allele topologies at an unprecedented depth to reveal elusive genome folding principles.
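
One way to relate multiway contacts to the pairwise interactions discussed in this abstract is to expand each multi-contact read into all pairs of its fragments, so a read with n fragments yields n(n-1)/2 pairs. The Python sketch below shows that expansion. Whether HiPore-C analyses do exactly this is not stated in the abstract; the function name and the fragment labels are hypothetical.

```python
from itertools import combinations


def pairwise_contacts(fragments):
    """Expand one multi-contact read into all unordered fragment pairs."""
    return list(combinations(fragments, 2))


# Hypothetical read touching four genomic fragments.
read = ["chr1:100", "chr1:250", "chr1:900", "chr2:40"]
pairs = pairwise_contacts(read)
print(len(pairs))  # → 6
```

The expansion discards the information that all fragments were observed on the same molecule, which is the single-allele signal the multiway reads are meant to preserve.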


Subject(s)
Chromatin , Humans , Alleles , Chromatin/genetics
19.
Nat Commun ; 14(1): 4054, 2023 07 08.
Article in English | MEDLINE | ID: mdl-37422489

ABSTRACT

Long single-molecular sequencing technologies, such as PacBio circular consensus sequencing (CCS) and nanopore sequencing, are advantageous in detecting DNA 5-methylcytosine in CpGs (5mCpGs), especially in repetitive genomic regions. However, existing methods for detecting 5mCpGs using PacBio CCS are less accurate and robust. Here, we present ccsmeth, a deep-learning method to detect DNA 5mCpGs using CCS reads. We sequence polymerase-chain-reaction treated and M.SssI-methyltransferase treated DNA of one human sample using PacBio CCS for training ccsmeth. Using long (≥10 Kb) CCS reads, ccsmeth achieves 0.90 accuracy and 0.97 Area Under the Curve on 5mCpG detection at single-molecule resolution. At the genome-wide site level, ccsmeth achieves >0.90 correlations with bisulfite sequencing and nanopore sequencing using only 10× reads. Furthermore, we develop a Nextflow pipeline, ccsmethphase, to detect haplotype-aware methylation using CCS reads, and then sequence a Chinese family trio to validate it. ccsmeth and ccsmethphase can be robust and accurate tools for detecting DNA 5-methylcytosines.


Subject(s)
5-Methylcytosine , DNA , Humans , Consensus , DNA/genetics , Sequence Analysis, DNA/methods , DNA Methylation , High-Throughput Nucleotide Sequencing/methods
20.
Nat Commun ; 14(1): 2631, 2023 05 06.
Article in English | MEDLINE | ID: mdl-37149708

ABSTRACT

Although long-read single-cell RNA isoform sequencing (scISO-Seq) can reveal alternative RNA splicing in individual cells, it suffers from a low read throughput. Here, we introduce HIT-scISOseq, a method that removes most artifact cDNAs and concatenates multiple cDNAs for PacBio circular consensus sequencing (CCS) to achieve high-throughput and high-accuracy single-cell RNA isoform sequencing. HIT-scISOseq can yield >10 million high-accuracy long-reads in a single PacBio Sequel II SMRT Cell 8M. We also report the development of scISA-Tools that demultiplex HIT-scISOseq concatenated reads into single-cell cDNA reads with >99.99% accuracy and specificity. We apply HIT-scISOseq to characterize the transcriptomes of 3375 corneal limbus cells and reveal cell-type-specific isoform expression in them. HIT-scISOseq is a high-throughput, high-accuracy, technically accessible method and it can accelerate the burgeoning field of long-read single-cell transcriptomics.


Subject(s)
RNA Isoforms , RNA , RNA Isoforms/genetics , High-Throughput Nucleotide Sequencing/methods , Consensus , Protein Isoforms/genetics , Sequence Analysis, DNA/methods , Sequence Analysis, RNA