Results 1 - 20 of 599
1.
Science ; 385(6711): 892-898, 2024 Aug 23.
Article in English | MEDLINE | ID: mdl-39172826

ABSTRACT

Single-molecule techniques are ideally poised to characterize complex dynamics but are typically limited to investigating a small number of different samples. However, a large sequence or chemical space often needs to be explored to derive a comprehensive understanding of complex biological processes. Here we describe multiplexed single-molecule characterization at the library scale (MUSCLE), a method that combines single-molecule fluorescence microscopy with next-generation sequencing to enable highly multiplexed observations of complex dynamics. We comprehensively profiled the sequence dependence of DNA hairpin properties and Cas9-induced target DNA unwinding-rewinding dynamics. The ability to explore a large sequence space for Cas9 allowed us to identify a number of target sequences with unexpected behaviors. We envision that MUSCLE will enable the mechanistic exploration of many fundamental biological processes.


Subjects
DNA; High-Throughput Nucleotide Sequencing; Microscopy, Fluorescence; Single Molecule Imaging; High-Throughput Nucleotide Sequencing/methods; Single Molecule Imaging/methods; DNA/chemistry; DNA/genetics; Microscopy, Fluorescence/methods; CRISPR-Associated Protein 9; Sequence Analysis, DNA/methods; Gene Library; CRISPR-Cas Systems
2.
Sci Rep ; 14(1): 18650, 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39134627

ABSTRACT

Exposure to ionizing radiation can induce genetic aberrations via unrepaired DNA strand breaks. To investigate quantitatively the dose-effect relationship at the molecular level, we irradiated dry pBR322 plasmid DNA with 3 MeV protons and assessed fragmentation yields at different radiation doses using long-read sequencing from Oxford Nanopore Technologies. This technology applied to a reference DNA model revealed dose-dependent fragmentation, as evidenced by read length distributions, showing no discernible radiation sensitivity in specific genetic sequences. In addition, we propose a method for directly measuring the single-strand break (SSB) yield. Furthermore, through a comparative study with a collection of previous works on dry DNA irradiation, we show that the irradiation protocol leads to biases in the definition of ionizing sources. We support this scenario by discussing the size distributions of nanopore sequencing reads in the light of Geant4 and Geant4-DNA simulation toolkit predictions. We show that integrating long-read sequencing technologies with advanced Monte Carlo simulations paves a promising path toward advancing our comprehension and prediction of radiation-induced DNA fragmentation.


Subjects
DNA Fragmentation; Monte Carlo Method; Plasmids; Plasmids/genetics; DNA Fragmentation/radiation effects; Dose-Response Relationship, Radiation; Sequence Analysis, DNA/methods; DNA Breaks, Single-Stranded/radiation effects; DNA/genetics
3.
Nat Commun ; 15(1): 6956, 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39138168

ABSTRACT

Structural variants (SVs) significantly contribute to human genome diversity and play a crucial role in precision medicine. Although advancements in single-molecule long-read sequencing offer a groundbreaking resource for SV detection, identifying SV breakpoints and sequences accurately and robustly remains challenging. We introduce VolcanoSV, an innovative hybrid SV detection pipeline that utilizes both a reference genome and local de novo assembly to generate a phased diploid assembly. VolcanoSV uses phased SNPs and unique k-mer similarity analysis, enabling precise haplotype-resolved SV discovery. VolcanoSV is adept at constructing comprehensive genetic maps encompassing SNPs, small indels, and all types of SVs, making it well-suited for human genomics studies. Our extensive experiments demonstrate that VolcanoSV surpasses state-of-the-art assembly-based tools in the detection of insertion and deletion SVs, exhibiting superior recall, precision, F1 scores, and genotype accuracy across a diverse range of datasets, including low-coverage (10x) datasets. VolcanoSV outperforms assembly-based tools in the identification of complex SVs, including translocations, duplications, and inversions, in both simulated and real cancer data. Moreover, VolcanoSV is robust to various evaluation parameters and accurately identifies breakpoints and SV sequences.


Subjects
Diploidy; Genome, Human; Genomic Structural Variation; Polymorphism, Single Nucleotide; Humans; Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Software; Haplotypes
9.
HLA ; 104(2): e15654, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39149758

ABSTRACT

Full genomic sequence shows HLA-G*01:19 differs from HLA-G*01:04:01:01 only at position 99 in exon 2.


Subjects
Alleles; Exons; HLA-G Antigens; Humans; Base Sequence; Histocompatibility Testing; HLA-G Antigens/genetics; Sequence Analysis, DNA/methods
10.
Brief Bioinform ; 25(5), 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39177264

ABSTRACT

The recent nanopore sequencing system (R10.4) has enhanced base-calling accuracy and is increasingly used for detecting CpG methylation state. However, the robustness and universality of the methylation-calling model in the officially supplied Dorado remain poorly tested. In this study, we obtained heterogeneous datasets from human and plant sources to carry out comprehensive evaluations, which showed that Dorado performed significantly differently across datasets. We therefore developed deep neural networks and implemented several optimizations in training a new model called DeepBAM. DeepBAM achieved superior and more stable performance compared with Dorado, including higher area under the ROC curve (98.47% on average and up to 7.36% improvement) and F1 scores (94.97% on average and up to 16.24% improvement) across the datasets. DeepBAM-based whole-genome methylation frequencies achieved >0.95 correlations with BS-seq on four of five datasets, outperforming Dorado in all instances. DeepBAM enables the unraveling of allele-specific methylation patterns, including in regions of transposable elements. The enhanced performance of DeepBAM paves the way for broader applications of nanopore sequencing in CpG methylation studies.


Subjects
CpG Islands; DNA Methylation; Nanopore Sequencing; Nanopore Sequencing/methods; Humans; Software; Sequence Analysis, DNA/methods; Neural Networks, Computer
11.
Microbiologyopen ; 13(4): e1432, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39166362

ABSTRACT

The long-read sequencing platform MinION, developed by Oxford Nanopore Technologies, enables the sequencing of bacterial genomes in resource-limited settings, such as field conditions or low- and middle-income countries. For this purpose, protocols for extracting high-molecular-weight DNA using nonhazardous, inexpensive reagents and equipment are needed, and some methods have been developed for gram-negative bacteria. However, we found that without modification, these protocols are unsuitable for gram-positive Streptococcus spp., a major threat to fish farming and food security in low- and middle-income countries. Multiple approaches were evaluated, and the most effective was an extraction method using lysozyme, sodium dodecyl sulfate, and proteinase K for lysis of bacterial cells and magnetic beads for DNA recovery. We optimized the method to consistently achieve sufficient yields of pure high-molecular-weight DNA with minimal reagents and time and developed a version of the protocol which can be performed without a centrifuge or electrical power. The suitability of the method was verified by MinION sequencing and assembly of 12 genomes of epidemiologically diverse fish-pathogenic Streptococcus iniae and Streptococcus agalactiae isolates. The combination of effective high-molecular-weight DNA extraction and MinION sequencing enabled the discovery of a naturally occurring 15 kb low-copy number mobilizable plasmid in S. iniae, which we name pSI1. We expect that our resource-limited settings-adapted protocol for high-molecular-weight DNA extraction could be implemented successfully for similarly recalcitrant-to-lysis gram-positive bacteria, and it represents a method of choice for MinION-based disease diagnostics in low- and middle-income countries.


Subjects
DNA, Bacterial; Nanopore Sequencing; Streptococcus; Streptococcus/genetics; Streptococcus/isolation & purification; Streptococcus/classification; DNA, Bacterial/genetics; Nanopore Sequencing/methods; Animals; Genome, Bacterial/genetics; Molecular Weight; Sequence Analysis, DNA/methods; Fishes/microbiology; Fish Diseases/microbiology; Streptococcal Infections/microbiology; Resource-Limited Settings
12.
Nat Commun ; 15(1): 6852, 2024 Aug 10.
Article in English | MEDLINE | ID: mdl-39127768

ABSTRACT

Cis-regulatory elements (CREs) are pivotal in orchestrating gene expression throughout diverse biological systems. Accurate identification and in-depth characterization of functional CREs are crucial for decoding gene regulation networks during cellular processes. In this study, we develop Kethoxal-Assisted Single-stranded DNA Assay for Transposase-Accessible Chromatin with Sequencing (KAS-ATAC-seq) to quantitatively analyze the transcriptional activity of CREs. A main advantage of KAS-ATAC-seq lies in its precise measurement of ssDNA levels within both proximal and distal ATAC-seq peaks, enabling the identification of transcriptional regulatory sequences. This feature makes it particularly well suited to defining Single-Stranded Transcribing Enhancers (SSTEs). SSTEs are highly enriched with nascent RNAs and with binding sites for specific transcription factors (TFs) that define cellular identity. Moreover, KAS-ATAC-seq provides a detailed characterization of various SSTE subtypes and their functional implications. Our analysis of CREs during mouse neural differentiation demonstrates that KAS-ATAC-seq can effectively identify immediate-early activated CREs in response to retinoic acid (RA) treatment. Our findings indicate that KAS-ATAC-seq provides more precise annotation of functional CREs in transcription. Future applications of KAS-ATAC-seq would help elucidate the intricate dynamics of gene regulation in diverse biological processes.


Subjects
Transcription Factors; Animals; Mice; Transcription Factors/metabolism; Transcription Factors/genetics; Transcription, Genetic; Enhancer Elements, Genetic/genetics; Chromatin/metabolism; Chromatin/genetics; Binding Sites; Humans; DNA, Single-Stranded/genetics; DNA, Single-Stranded/metabolism; Chromatin Immunoprecipitation Sequencing/methods; Transposases/metabolism; Transposases/genetics; Regulatory Elements, Transcriptional; Tretinoin/pharmacology; Tretinoin/metabolism; Gene Expression Regulation; Cell Differentiation/genetics; Sequence Analysis, DNA/methods; Regulatory Sequences, Nucleic Acid/genetics
13.
Methods Mol Biol ; 2818: 3-22, 2024.
Article in English | MEDLINE | ID: mdl-39126464

ABSTRACT

During meiosis, Spo11 generates DNA double-strand breaks to induce recombination, becoming covalently attached to the 5' ends on both sides of the break during this process. Such Spo11 "covalent complexes" are transient in wild-type cells, but accumulate in nuclease mutants unable to initiate repair. The CC-seq method presented here details how to map the location of these Spo11 complexes genome-wide with strand-specific nucleotide-resolution accuracy in synchronized Saccharomyces cerevisiae meiotic cells.


Subjects
DNA Breaks, Double-Stranded; Endodeoxyribonucleases; Meiosis; Saccharomyces cerevisiae; Saccharomyces cerevisiae/genetics; Endodeoxyribonucleases/metabolism; Endodeoxyribonucleases/genetics; Meiosis/genetics; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/metabolism; DNA, Fungal/genetics; DNA, Fungal/metabolism; Sequence Analysis, DNA/methods; DNA Repair
14.
BMC Genomics ; 25(1): 778, 2024 Aug 10.
Article in English | MEDLINE | ID: mdl-39127634

ABSTRACT

BACKGROUND: DNA sequencing is a critical tool in modern biology. Over the last two decades, it has been revolutionized by the advent of massively parallel sequencing, leading to significant advances in the genome and transcriptome sequencing of various organisms. Nevertheless, challenges with accuracy, lack of competitive options and prohibitive costs associated with high-throughput parallel short-read sequencing persist. RESULTS: Here, we conduct a comparative analysis using matched DNA and RNA short-read assays between Element Biosciences' AVITI and Illumina's NextSeq 550 chemistries. Similar comparisons were made for synthetic long-read sequencing of RNA and targeted single-cell transcripts between the AVITI and Illumina's NovaSeq 6000. For both DNA and RNA short-read applications, the AVITI produced significantly higher per-sequence quality scores. For PCR-free DNA libraries, we observed an average 89.7% lower experimentally determined error rate with the AVITI chemistry than with the NextSeq 550. For short-read RNA quantification, the AVITI platform had an average 32.5% lower error rate than the NextSeq 550. With regard to synthetic long-read mRNA and targeted synthetic long-read single-cell mRNA sequencing, both platforms' chemistries performed comparably in quantification of genes and isoforms. The AVITI displayed a marginally lower error rate for long reads, with fewer chemistry-specific errors and a higher mutation detection rate. CONCLUSION: These results point to the potential of the AVITI platform as a competitive candidate in high-throughput short-read sequencing analyses when compared with the Illumina NextSeq 550.


Subjects
High-Throughput Nucleotide Sequencing; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Sequence Analysis, RNA/methods; Humans; Single-Cell Analysis/methods; Gene Library
15.
Microbiome ; 12(1): 151, 2024 Aug 14.
Article in English | MEDLINE | ID: mdl-39143609

ABSTRACT

BACKGROUND: Metagenomic binning, the clustering of assembled contigs that belong to the same genome, is a crucial step for recovering metagenome-assembled genomes (MAGs). Contigs are linked by exploiting consistent signatures along a genome, such as read coverage patterns. Using coverage from multiple samples leads to higher-quality MAGs; however, standard pipelines require all-to-all read alignments for multiple samples to compute coverage, becoming a key computational bottleneck. RESULTS: We present fairy (https://github.com/bluenote-1577/fairy), an approximate coverage calculation method for metagenomic binning. Fairy is a fast k-mer-based alignment-free method. For multi-sample binning, fairy can be >250× faster than read alignment and accurate enough for binning. Fairy is compatible with several existing binners on host and non-host-associated datasets. Using MetaBAT2, fairy recovers 98.5% of MAGs with >50% completeness and <5% contamination relative to alignment with BWA. Notably, multi-sample binning with fairy is always better than single-sample binning using BWA (>1.5× more >50% complete MAGs on average) while still being faster. For a public sediment metagenome project, we demonstrate that multi-sample binning recovers higher quality Asgard archaea MAGs than single-sample binning and that fairy's results are indistinguishable from read alignment. CONCLUSIONS: Fairy is a new tool for approximately and quickly calculating multi-sample coverage for binning, resolving a computational bottleneck for metagenomics.


Subjects
Metagenome; Metagenomics; Metagenomics/methods; Software; Sequence Analysis, DNA/methods; Computational Biology/methods; Archaea/genetics; Archaea/classification; Algorithms
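
The idea of alignment-free, k-mer-based coverage estimation described in the abstract above can be illustrated with a toy sketch. This is not fairy's actual algorithm or API: the function names (`kmers`, `count_read_kmers`, `approx_coverage`), the use of exact k-mer counting, the median summary, and the omission of reverse complements and sketching are all simplifying assumptions made for illustration.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_read_kmers(reads, k):
    """Count how often each k-mer occurs across all reads (no alignment)."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def approx_coverage(contig, read_counts, k):
    """Estimate contig coverage as the median read count of its k-mers.

    Hypothetical simplification: a real tool would subsample k-mers and
    handle both strands; here every k-mer is looked up exactly.
    """
    hits = [read_counts[km] for km in kmers(contig, k)]
    return median(hits) if hits else 0.0


# Toy data: one contig and four error-free reads drawn from it.
contig = "ACGTACGGTCA"
reads = ["ACGTACGG", "GTACGGTC", "ACGGTCA", "ACGTACGGTCA"]
counts = count_read_kmers(reads, k=4)
print(approx_coverage(contig, counts, k=4))
```

Computing one such estimate per contig and per sample gives the kind of multi-sample coverage table that a binner consumes, without running a read aligner for each sample pair.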
16.
PLoS Comput Biol ; 20(8): e1011854, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39093856

ABSTRACT

Single-cell ATAC-seq sequencing data (scATAC-seq) has been widely used to investigate chromatin accessibility on the single-cell level. One important application of scATAC-seq data analysis is differential chromatin accessibility (DA) analysis. However, the data characteristics of scATAC-seq such as excessive zeros and large variability of chromatin accessibility across cells impose a unique challenge for DA analysis. Existing statistical methods focus on detecting the mean difference of the chromatin accessible regions while overlooking the distribution difference. Motivated by real data exploration that distribution difference exists among cell types, we introduce a novel composite statistical test named "scaDA", which is based on zero-inflated negative binomial model (ZINB), for performing differential distribution analysis of chromatin accessibility by jointly testing the abundance, prevalence and dispersion simultaneously. Benefiting from both dispersion shrinkage and iterative refinement of mean and prevalence parameter estimates, scaDA demonstrates its superiority to both ZINB-based likelihood ratio tests and published methods by achieving the highest power and best FDR control in a comprehensive simulation study. In addition to demonstrating the highest power in three real sc-multiome data analyses, scaDA successfully identifies differentially accessible regions in microglia from sc-multiome data for an Alzheimer's disease (AD) study that are most enriched in GO terms related to neurogenesis and the clinical phenotype of AD, and AD-associated GWAS SNPs.


Subjects
Chromatin; Single-Cell Analysis; Chromatin/genetics; Chromatin/metabolism; Chromatin/chemistry; Single-Cell Analysis/methods; Single-Cell Analysis/statistics & numerical data; Humans; Computational Biology/methods; Alzheimer Disease/genetics; Models, Statistical; Chromatin Immunoprecipitation Sequencing/methods; Computer Simulation; Animals; Sequence Analysis, DNA/methods; Algorithms
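
For readers unfamiliar with the zero-inflated negative binomial (ZINB) model named in the abstract above, the following minimal sketch evaluates its probability mass function. The parameterization is an assumption chosen for illustration (mean `mu` for abundance, `size` for dispersion, and `pi` as the probability of a structural zero, so that `1 - pi` plays the role of prevalence); it is not taken from the scaDA implementation.

```python
import math


def nb_pmf(k, mu, size):
    """Negative binomial pmf with mean mu > 0 and dispersion parameter size > 0."""
    log_p = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k, mu, size, pi):
    """ZINB pmf: with probability pi the count is a structural zero,
    otherwise it is drawn from the negative binomial."""
    base = (1.0 - pi) * nb_pmf(k, mu, size)
    return pi + base if k == 0 else base


# Zero inflation raises P(0) above what the negative binomial alone predicts.
print(nb_pmf(0, mu=2.0, size=1.5), zinb_pmf(0, mu=2.0, size=1.5, pi=0.3))
```

A test of differential distribution between two cell groups would compare fitted values of all three parameters rather than the mean alone, which is the distinction the abstract draws from mean-difference methods.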
17.
J Pharm Biomed Anal ; 249: 116397, 2024 Oct 15.
Article in English | MEDLINE | ID: mdl-39111245

ABSTRACT

We proposed a single-color fluorogenic DNA decoding sequencing method designed to improve sequencing accuracy, increase read length and throughput, as well as decrease scanning time. This method involves the incorporation of a mixture of four types of 3'-O-modified nucleotide reversible terminators into each reaction. Among them, two nucleotides are labeled with the same fluorophore, while the remaining two are unlabeled. Only one nucleotide can be extended in each reaction, and an encoding that partially defines base composition can be obtained. Through cyclic interrogation of a template twice with different nucleotide combinations, two sets of encodings are sequentially obtained, enabling the determination of the sequence. We demonstrate the feasibility of this method using established sequencing chemistry, achieving a cycle efficiency of approximately 99.5 %. Notably, this strategy exhibits remarkable efficacy in the detection and correction of sequencing errors, achieving a theoretical error rate of 0.00016 % at a sequencing depth of ×2, which is lower than Sanger sequencing. This method is theoretically compatible with the existing sequencing-by-synthesis (SBS) platforms, and the instrument is simpler, which may facilitate further reductions in sequencing costs, thereby broadening its applications in biology and medicine. Moreover, we demonstrate the capability to detect known mutation sites using information from only a single sequencing run. We validate this approach by accurately identifying a mutation site in the human mitochondrial DNA.


Subjects
Fluorescent Dyes; Mutation; Fluorescent Dyes/chemistry; Humans; Sequence Analysis, DNA/methods; High-Throughput Nucleotide Sequencing/methods; DNA/genetics; Genotype; Genotyping Techniques/methods; DNA Mutational Analysis/methods; DNA, Mitochondrial/genetics
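
The two-pass decoding logic described in the abstract above can be sketched as follows. Each pass yields one bit per cycle (signal or no signal), and two passes with different labeled pairs identify the base uniquely. The specific labeled pairs (`PASS1_LABELED`, `PASS2_LABELED`) and all function names are hypothetical choices for illustration; the abstract does not state which nucleotide combinations the authors use.

```python
# Hypothetical labeling scheme: which bases carry the fluorophore in each pass.
PASS1_LABELED = frozenset("AC")
PASS2_LABELED = frozenset("AG")

# Map each (pass 1 bit, pass 2 bit) pair back to the base that produces it.
DECODE = {
    (int(b in PASS1_LABELED), int(b in PASS2_LABELED)): b for b in "ACGT"
}


def encode(seq, labeled):
    """Simulate one pass: 1 if the incorporated base is labeled, else 0."""
    return [1 if base in labeled else 0 for base in seq]


def decode(enc1, enc2):
    """Combine the two per-cycle encodings into a base sequence."""
    if len(enc1) != len(enc2):
        raise ValueError("encodings must cover the same number of cycles")
    return "".join(DECODE[pair] for pair in zip(enc1, enc2))


template = "GATTACA"
first = encode(template, PASS1_LABELED)
second = encode(template, PASS2_LABELED)
print(first, second, decode(first, second))
```

Because a single pass only narrows each position to one of two bases, one run is still enough to check a known mutation site whenever the reference and variant bases fall in different labeled groups, which is consistent with the single-run detection the abstract reports.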
18.
Sci Data ; 11(1): 892, 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39152166

ABSTRACT

Next-generation sequencing (NGS) has revolutionized genomic research by enabling high-throughput, cost-effective genome and transcriptome sequencing accelerating personalized medicine for complex diseases, including cancer. Whole genome/transcriptome sequencing (WGS/WTS) provides comprehensive insights, while targeted sequencing is more cost-effective and sensitive. In comparison to short-read sequencing, which still dominates the field due to high speed and cost-effectiveness, long-read sequencing can overcome alignment limitations and better discriminate similar sequences from alternative transcripts or repetitive regions. Hybrid sequencing combines the best strengths of different technologies for a more comprehensive view of genomic/transcriptomic variations. Understanding each technology's strengths and limitations is critical for translating cutting-edge technologies into clinical applications. In this study, we sequenced DNA and RNA libraries of reference samples using various targeted DNA and RNA panels and the whole transcriptome on both short-read and long-read platforms. This study design enables a comprehensive analysis of sequencing technologies, targeting protocols, and library preparation methods. Our expanded profiling landscape establishes a reference point for assessing current sequencing technologies, facilitating informed decision-making in genomic research and precision medicine.


Subjects
High-Throughput Nucleotide Sequencing; Humans; RNA-Seq; Sequence Analysis, DNA/methods; Transcriptome; Sequence Analysis, RNA; Precision Medicine
19.
BMC Bioinformatics ; 25(1): 267, 2024 Aug 19.
Article in English | MEDLINE | ID: mdl-39160480

ABSTRACT

BACKGROUND: The use of long reads for single nucleotide polymorphism (SNP) phasing has become popular, providing substantial support for research on human diseases and genetic studies in animals and plants. However, due to the complexity of the linkage relationships between SNP loci and sequencing errors in the reads, recent methods still cannot yield satisfactory results. RESULTS: In this study, we present a graph-based algorithm, GCphase, which utilizes the minimum cut algorithm to perform phasing. First, based on alignment between long reads and the reference genome, GCphase filters out ambiguous SNP sites and useless read information. Second, GCphase constructs a graph in which a vertex represents alleles of an SNP locus and each edge represents the presence of read support; moreover, GCphase adopts a graph minimum-cut algorithm to phase the SNPs. Next, GCphase uses two error correction steps to refine the phasing results obtained from the previous step, effectively reducing the error rate. Finally, GCphase obtains the phase block. GCphase was compared to three other methods, WhatsHap, HapCUT2, and LongPhase, on the Nanopore and PacBio long-read datasets. The code is available from https://github.com/baimawjy/GCphase . CONCLUSIONS: Experimental results show that, across different sequencing depths and datasets, GCphase has the fewest switch errors and the highest accuracy compared with the other methods.


Subjects
Algorithms; Polymorphism, Single Nucleotide; Polymorphism, Single Nucleotide/genetics; Humans; Sequence Analysis, DNA/methods; Software; High-Throughput Nucleotide Sequencing/methods
20.
BMC Genomics ; 25(1): 789, 2024 Aug 19.
Article in English | MEDLINE | ID: mdl-39160478

ABSTRACT

BACKGROUND: Detecting very minor (< 1%) subpopulations using next-generation sequencing is a critical need for multiple applications, including the detection of drug resistant pathogens and somatic variant detection in oncology. A recently available sequencing approach termed 'sequencing by binding (SBB)' claims to have higher base calling accuracy data "out of the box." This paper evaluates the utility of using SBB for the detection of ultra-rare drug resistant subpopulations in Mycobacterium tuberculosis (Mtb) using a targeted amplicon assay and compares the performance of SBB to single molecule overlapping reads (SMOR) error corrected sequencing by synthesis (SBS) data. RESULTS: SBS displayed an elevated error rate when compared to SMOR error-corrected SBS and SBB techniques. SMOR error-corrected SBS and SBB technologies performed similarly within the linear range studies and error rate studies. CONCLUSIONS: With lower sequencing error rates within SBB sequencing, this technique looks promising for both targeted and unbiased whole genome sequencing, leading to the identification of minor (< 1%) subpopulations without the need for error correction methods.


Subjects
High-Throughput Nucleotide Sequencing; Mycobacterium tuberculosis; Mycobacterium tuberculosis/genetics; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Humans; Whole Genome Sequencing/methods