1.
Proc Natl Acad Sci U S A ; 119(5), 2022 02 01.
Article in English | MEDLINE | ID: mdl-35074874

ABSTRACT

For nearly 50 years, the vision of using single molecules in circuits has been seen as providing the ultimate miniaturization of electronic chips. An advanced example of such a molecular electronics chip is presented here, with the important distinction that the molecular circuit elements play the role of general-purpose single-molecule sensors. The device consists of a semiconductor chip with a scalable array architecture. Each array element contains a synthetic molecular wire assembled to span nanoelectrodes in a current monitoring circuit. A central conjugation site is used to attach a single probe molecule that defines the target of the sensor. The chip digitizes the resulting picoamp-scale current-versus-time readout from each sensor element of the array at a rate of 1,000 frames per second. This provides detailed electrical signatures of the single-molecule interactions between the probe and targets present in a solution-phase test sample. This platform is used to measure the interaction kinetics of single molecules, without the use of labels, in a massively parallel fashion. To demonstrate broad applicability, examples are shown for probe molecule binding, including DNA oligos, aptamers, antibodies, and antigens, and the activity of enzymes relevant to diagnostics and sequencing, including a CRISPR/Cas enzyme binding a target DNA, and a DNA polymerase enzyme incorporating nucleotides as it copies a DNA template. All of these applications are accomplished with high sensitivity and resolution, on a manufacturable, scalable, all-electronic semiconductor chip device, thereby bringing the power of modern chips to these diverse areas of biosensing.


Subjects
Biosensing Techniques/instrumentation, Electronics/instrumentation, Enzyme Assays/instrumentation, Oligonucleotide Array Sequence Analysis/instrumentation, DNA, Equipment Design/instrumentation, Kinetics, Lab-On-A-Chip Devices, Miniaturization/instrumentation, Nanotechnology/instrumentation, Semiconductors
2.
Muscle Nerve ; 69(6): 708-718, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38558464

ABSTRACT

INTRODUCTION/AIMS: GNE myopathy is a rare autosomal recessive disorder caused by pathogenic variants in the GNE gene, which is essential for the sialic acid biosynthesis pathway. Although over 300 GNE variants have been reported, some patients with monoallelic pathogenic variants remain undiagnosed. This study aims to analyze the entire GNE genomic region to identify novel pathogenic variants. METHODS: Patients with clinically compatible GNE myopathy and monoallelic pathogenic variants in the GNE gene were enrolled. The other GNE pathogenic variant was verified using comprehensive methods including exon 2 quantitative polymerase chain reaction and nanopore long-read single-molecule sequencing (LRS). RESULTS: A deep intronic GNE variant, c.862+870C>T, was identified in nine patients from eight unrelated families. This variant generates a cryptic splice site, resulting in the activation of a novel pseudoexon between exons 5 and 6. It results in the insertion of an extra 146 nucleotides into the messenger RNA (mRNA), which is predicted to result in a truncated human GNE1 (hGNE1) protein. Peanut agglutinin (PNA) lectin staining of muscle tissues showed reduced sialylation of mucin O-glycans on sarcolemmal glycoproteins. Notably, a third of patients with the c.862+870C>T variant exhibited thrombocytopenia. A common core haplotype harboring the deep intronic GNE variant was found in all these patients. DISCUSSION: The transcript with pseudoexon activation potentially affects sialic acid biosynthesis via nonsense-mediated mRNA decay, or by producing a truncated hGNE1 protein that interferes with normal enzyme function. LRS is expected to be more frequently incorporated in genetic analysis given its efficacy in detecting hard-to-find pathogenic variants.


Subjects
Exons, Introns, Multienzyme Complexes, Thrombocytopenia, Humans, Male, Female, Multienzyme Complexes/genetics, Exons/genetics, Introns/genetics, Adult, Thrombocytopenia/genetics, Distal Myopathies/genetics, Young Adult, Adolescent, Child, Skeletal Muscle/metabolism, Skeletal Muscle/pathology, Pedigree, Middle Aged
3.
Brain ; 146(3): 1075-1082, 2023 03 01.
Article in English | MEDLINE | ID: mdl-35481544

ABSTRACT

While many genetic causes of movement disorders have been identified, modifiers of disease expression are largely unknown. X-linked dystonia-parkinsonism (XDP) is a neurodegenerative disease caused by a SINE-VNTR-Alu(AGAGGG)n retrotransposon insertion in TAF1, with a polymorphic (AGAGGG)n repeat. Repeat length and variants in MSH3 and PMS2 explain ∼65% of the variance in age at onset (AAO) in XDP. However, additional genetic modifiers are conceivably at play in XDP, such as repeat interruptions. Long-read nanopore sequencing of PCR amplicons from XDP patients (n = 202) was performed to assess potential repeat interruption and instability. Repeat-primed PCR and Cas9-mediated targeted enrichment confirmed the presence of identified divergent repeat motifs. In addition to the canonical pure SINE-VNTR-Alu-5'-(AGAGGG)n, we observed a mosaic of divergent repeat motifs that polarized at the beginning of the tract, where the divergent repeat interruptions varied in motif length by having one, two, or three nucleotides fewer than the hexameric motif, distinct from interruptions in other disease-associated repeats, which match the lengths of the canonical motifs. All divergent configurations occurred mosaically and in two investigated brain regions (basal ganglia, cerebellum) and in blood-derived DNA from the same patient. The most common divergent interruption was AGG [5'-SINE-VNTR-Alu(AGAGGG)2AGG(AGAGGG)n], similar to the pure tract, followed by AGGG [5'-SINE-VNTR-Alu(AGAGGG)2AGGG(AGAGGG)n], at median frequencies of 0.425 (IQR: 0.42-0.43) and 0.128 (IQR: 0.12-0.13), respectively. The mosaic AGG motif was not associated with repeat number (estimate = -3.8342, P = 0.869). The mosaic pure tract frequency was associated with repeat number (estimate = 45.32, P = 0.0441) but not AAO (estimate = -41.486, P = 0.378). Importantly, the mosaic frequency of the AGGG negatively correlated with repeat number after adjusting for age at sampling (estimate = -161.09, P = 3.44 × 10⁻⁵).
When including the XDP-relevant MSH3/PMS2 modifier single nucleotide polymorphisms into the model, the mosaic AGGG frequency was associated with AAO (estimate = 155.1063, P = 0.047); however, the association dissipated after including the repeat number (estimate = -92.46430, P = 0.079). We reveal novel mosaic divergent repeat interruptions affecting both motif length and sequence (DRILS) of the canonical motif polarized within the SINE-VNTR-Alu(AGAGGG)n repeat. Our study illustrates: (i) the importance of somatic mosaic genotypes; (ii) the biological plausibility of multiple modifiers (both germline and somatic) that can have additive effects on repeat instability; and (iii) that these variations may remain undetected without assessment of single molecules.


Subjects
Dystonic Disorders, X-Linked Genetic Diseases, Neurodegenerative Diseases, Humans, Mismatch Repair Endonuclease PMS2, Dystonic Disorders/genetics, X-Linked Genetic Diseases/genetics
4.
Mol Cell Proteomics ; 21(7): 100254, 2022 07.
Article in English | MEDLINE | ID: mdl-35654359

ABSTRACT

All human diseases involve proteins, yet our current tools to characterize and quantify them are limited. To better elucidate proteins across space, time, and molecular composition, we provide a projection of more than 10 years for technologies to meet the challenges that protein biology presents. With a broad perspective, we discuss grand opportunities to transition the science of proteomics into a more propulsive enterprise. Extrapolating recent trends, we describe a next generation of approaches to define, quantify, and visualize the multiple dimensions of the proteome, thereby transforming our understanding of and interactions with human disease in the coming decade.


Subjects
Proteome, Proteomics, Humans, Proteome/metabolism, Proteomics/methods
5.
Med Princ Pract ; 33(3): 215-231, 2024.
Article in English | MEDLINE | ID: mdl-38442703

ABSTRACT

HLA typing serves as a standard practice in hematopoietic stem cell transplantation to ensure compatibility between donors and recipients, preventing the occurrence of allograft rejection and graft-versus-host disease. Conventional laboratory methods that have been widely employed in the past few years, including sequence-specific primer PCR and sequencing-based typing (SBT), currently face the risk of becoming obsolete. This risk stems not only from the extensive diversity within HLA genes but also from the rapid advancement of next-generation sequencing and third-generation sequencing technologies. Third-generation sequencing systems like single-molecule real-time (SMRT) sequencing and Oxford Nanopore (ONT) sequencing have the capability to analyze long-read sequences that span entire intronic-exonic regions of HLA genes, effectively addressing challenges related to HLA ambiguity and the phasing of multiple short-read fragments. The growing dominance of these advanced sequencers in HLA typing is expected to solidify further through ongoing refinements, cost reduction, and error rate minimization. This review focuses on hematopoietic stem cell transplantation (HSCT) and explores prospective advancements and application of HLA DNA typing techniques. It explores how the adoption of third-generation sequencing technologies can revolutionize the field by offering improved accuracy, reduced ambiguity, and enhanced assessment of compatibility in HSCT. Embracing these cutting-edge technologies is essential to advancing the success rates and outcomes of hematopoietic stem cell transplantation. This review underscores the importance of staying at the forefront of HLA typing techniques to ensure the best possible outcomes for patients undergoing HSCT.


Subjects
Hematopoietic Stem Cell Transplantation, High-Throughput Nucleotide Sequencing, Histocompatibility Testing, Humans, Histocompatibility Testing/methods, High-Throughput Nucleotide Sequencing/methods, Graft vs Host Disease/prevention & control, HLA Antigens/genetics, DNA Sequence Analysis/methods
6.
Lab Invest ; 103(4): 100043, 2023 04.
Article in English | MEDLINE | ID: mdl-36870287

ABSTRACT

Amplification biases caused by next-generation sequencing (NGS) for noninvasive prenatal screening (NIPS) may be reduced using single-molecule sequencing (SMS), during which PCR is omitted. Therefore, the performance of SMS-based NIPS was evaluated. We used SMS-based NIPS to screen for common fetal aneuploidies in 477 pregnant women. The sensitivity, specificity, positive predictive value, and negative predictive value were estimated. The GC-induced bias was compared between the SMS- and NGS-based NIPS methods. Notably, a sensitivity of 100% was achieved for fetal trisomy 13 (T13), trisomy 18 (T18), and trisomy 21 (T21). The positive predictive value was 46.15% for T13, 96.77% for T18, and 99.07% for T21. The overall specificity was 100% (334/334). Compared with NGS, SMS (without PCR) had less GC bias, a better distinction between T21 or T18 and euploidies, and better diagnostic performance. Overall, our results suggest that SMS improves the performance of NIPS for common fetal aneuploidies by reducing the GC bias introduced during library preparation and sequencing.
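
The four screening metrics named in this abstract follow directly from a 2x2 confusion table. The sketch below is illustrative only: the function name and the counts are hypothetical and are not taken from the study, which reports percentages rather than the full table.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV and NPV as fractions.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives and false negatives. A metric whose denominator
    is zero is returned as None.
    """
    def ratio(num, den):
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # affected cases that screen positive
        "specificity": ratio(tn, tn + fp),  # unaffected cases that screen negative
        "ppv": ratio(tp, tp + fp),          # positive calls that are truly affected
        "npv": ratio(tn, tn + fn),          # negative calls that are truly unaffected
    }


# Hypothetical counts, for illustration only (not data from the abstract).
metrics = diagnostic_metrics(tp=9, fp=1, tn=90, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that PPV depends on how common the condition is in the screened population, so a test with 100% sensitivity can still have a modest PPV when true cases are rare.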


Subjects
Down Syndrome, Noninvasive Prenatal Testing, Pregnancy, Female, Humans, Aneuploidy, Down Syndrome/diagnosis, Down Syndrome/genetics, Predictive Value of Tests, High-Throughput Nucleotide Sequencing/methods
7.
Brief Bioinform ; 22(5), 2021 09 02.
Article in English | MEDLINE | ID: mdl-33483726

ABSTRACT

Extended turnaround times and large economic costs hinder the usage of currently applied screening methods for bacterial pathogen identification (ID) and antimicrobial susceptibility testing. This review provides an overview of current detection methods and their usage in a clinical setting. Issues of timeliness and cost could soon be circumvented, however, with the emergence of detection methods involving single molecule sequencing technology. In the context of bringing diagnostics closer to the point of care, we examine the current state of Oxford Nanopore Technologies (ONT) products and their interaction with third-party software/databases to assess their capabilities for ID and antimicrobial resistance (AMR) prediction. We outline and discuss a potential diagnostic workflow, enumerating (1) rapid sample prep kits, (2) ONT hardware/software and (3) third-party software and databases to improve the cost, accuracy and turnaround times for ID and AMR. Multiple studies across a range of infection types support that the speed and accuracy of ONT sequencing is now such that established ID and AMR prediction tools can be used on its outputs, and so it can be harnessed for near real time, close to the point-of-care diagnostics in common clinical circumstances.


Subjects
Bacteria/genetics, Bacterial Infections/diagnosis, High-Throughput Nucleotide Sequencing/methods, Nanopore Sequencing/methods, 16S Ribosomal RNA/genetics, 23S Ribosomal RNA/genetics, Anti-Bacterial Agents/pharmacology, Bacteria/classification, Bacteria/drug effects, Bacteria/growth & development, Bacterial Infections/drug therapy, Bacterial Infections/microbiology, Bacterial Drug Resistance/genetics, Humans, Microbial Sensitivity Tests, Point-of-Care Testing, Software
8.
Clin Chem ; 69(2): 168-179, 2023 02 01.
Article in English | MEDLINE | ID: mdl-36322427

ABSTRACT

BACKGROUND: Recent studies using single molecule, real-time (SMRT) sequencing revealed a substantial population of analyzable long cell-free DNA (cfDNA) in plasma. Potential clinical utilities of such long cfDNA in pregnancy and cancer have been demonstrated. However, the performance of different long-read sequencing platforms for the analysis of long cfDNA remains unknown. METHODS: Size biases of SMRT sequencing by Pacific Biosciences (PacBio) and nanopore sequencing by Oxford Nanopore Technologies (ONT) were evaluated using artificial mixtures of sonicated human and mouse DNA of different sizes. cfDNA from plasma samples of pregnant women at different trimesters, hepatitis B carriers, and patients with hepatocellular carcinoma were sequenced with the 2 platforms. RESULTS: Both platforms showed biases to sequence longer (1500 bp vs 200 bp) DNA fragments, with PacBio showing a stronger bias (5-fold overrepresentation of long fragments vs 2-fold in ONT). Percentages of cfDNA fragments >500 bp were around 6-fold higher in PacBio compared with ONT. End motif profiles of cfDNA from PacBio and ONT were similar, yet exhibited platform-dependent patterns. Tissue-of-origin analysis based on single-molecule methylation patterns showed comparable performance on both platforms. CONCLUSIONS: SMRT sequencing generated data with higher percentages of long cfDNA compared with nanopore sequencing. Yet, a higher number of long cfDNA fragments eligible for the tissue-of-origin analysis could be obtained from nanopore sequencing due to its much higher throughput. When analyzing the size and end motif of cfDNA, one should be aware of the analytical characteristics and possible biases of the sequencing platforms being used.


Subjects
Cell-Free Nucleic Acids, Liver Neoplasms, Nanopore Sequencing, Humans, Female, Pregnancy, Animals, Mice, Cell-Free Nucleic Acids/genetics, High-Throughput Nucleotide Sequencing, DNA Sequence Analysis, DNA/genetics
9.
Transfusion ; 63(8): 1441-1446, 2023 08.
Article in English | MEDLINE | ID: mdl-37165957

ABSTRACT

BACKGROUND: The Kidd blood group gene SLC14A1 (JK) accounts for approximately 20 Kb from initiation codon to stop codon in the genome. In genomic DNA analysis using Sanger sequencing or short-read-based next generation sequencing, it is difficult to determine the cis or trans positions of single nucleotide variations (SNVs), which are occasionally more than 1 Kb away from each other. We aimed to determine the complete nucleotide sequence of a 20-Kb genomic DNA amplicon to characterize the JK allelic variants associated with Kidd antigen silencing in a blood donor. STUDY DESIGN AND METHODS: The Jk(a-b-) phenotype was identified in this donor by standard serological typing. A DNA sample obtained from whole blood was amplified by long-range PCR to obtain a 20-Kb fragment of the SLC14A1 gene, including the initiation and stop codons. The fragment was then analyzed by Sanger sequencing and single-molecule sequencing. Transfection and expression studies were performed in CHO cells using the expression vector construct of JK alleles. RESULTS: Sanger sequencing and single-molecule sequencing revealed that the donor was heterozygous with JK*01 having c.276G>A (rs763262711, p.Trp92Ter) and JK*02 having c.499A>G (rs2298719, p.Met167Val), c.588A>G (rs2298718, p.Pro196Pro), and c.743C>A (p.Ala248Asp). The two JK alleles identified have not been previously described. Transfection and expression studies indicated that the CHO cells transfected with JK*02 having c.743C>A did not express the Jkb and Jk3 antigens. CONCLUSIONS: We identified new JK silencing alleles and their critical SNVs by single-molecule sequencing and the findings were confirmed by transfection and expression studies.


Subjects
DNA, Kidd Blood-Group System, Animals, Cricetinae, Kidd Blood-Group System/genetics, Alleles, Cricetulus, Heterozygote
10.
Int J Mol Sci ; 24(6), 2023 Mar 18.
Article in English | MEDLINE | ID: mdl-36982901

ABSTRACT

As important pollinators, honey bees play a crucial role in both maintaining the ecological balance and providing products for humans. Although several versions of the western honey bee genome have already been published, its transcriptome information still needs to be refined. In this study, PacBio single-molecule sequencing technology was used to sequence the full-length transcriptome of mixed samples from many developmental time points and tissues of A. mellifera queens, workers and drones. A total of 116,535 transcripts corresponding to 30,045 genes were obtained. Of these, 92,477 transcripts were annotated. Compared to the annotated genes and transcripts on the reference genome, 18,915 gene loci and 96,176 transcripts were newly identified. From these transcripts, 136,554 alternative splicing (AS) events, 23,376 alternative polyadenylation (APA) sites and 21,813 lncRNAs were detected. In addition, based on the full-length transcripts, we identified many differentially expressed transcripts (DETs) between queen, worker and drone. Our results provide a complete set of reference transcripts for A. mellifera that dramatically expand our understanding of the complexity and diversity of the honey bee transcriptome.


Subjects
High-Throughput Nucleotide Sequencing, Transcriptome, Humans, Bees/genetics, Animals, High-Throughput Nucleotide Sequencing/methods, Alternative Splicing, RNA Sequence Analysis, Molecular Sequence Annotation
11.
BMC Bioinformatics ; 23(1): 95, 2022 Mar 20.
Article in English | MEDLINE | ID: mdl-35307007

ABSTRACT

BACKGROUND: Third-generation sequencing offers some advantages over next-generation sequencing predecessors, but with the caveat of harboring a much higher error rate. Clustering related sequences is an essential task in modern biology. To accurately cluster sequences rich in errors, error type and frequency need to be accounted for. Levenshtein distance is a well-established mathematical algorithm for measuring the edit distance between words and can specifically weight insertions, deletions and substitutions. However, there are drawbacks to using Levenshtein distance in a biological context and hence it has rarely been used for this purpose. We present novel modifications to the Levenshtein distance algorithm to optimize it for clustering error-rich biological sequencing data. RESULTS: We successfully introduced a bidirectional frameshift allowance with end-user determined accommodation caps combined with weighted error discrimination. Furthermore, our modifications dramatically improved the computational speed of Levenshtein distance. For simulated ONT MinION and PacBio Sequel datasets, the average clustering sensitivity for 3GOLD was 41.45% (S.D. 10.39) higher than Sequence-Levenshtein distance, 52.14% (S.D. 9.43) higher than Levenshtein distance, 55.93% (S.D. 8.67) higher than Starcode, 42.68% (S.D. 8.09) higher than CD-HIT-EST and 61.49% (S.D. 7.81) higher than DNACLUST. For biological ONT MinION data, 3GOLD clustering sensitivity was 27.99% higher than Sequence-Levenshtein distance, 52.76% higher than Levenshtein distance, 56.39% higher than Starcode, 48% higher than CD-HIT-EST and 70.4% higher than DNACLUST. CONCLUSION: Our modifications to Levenshtein distance have improved its speed and accuracy compared to the classic Levenshtein distance, Sequence-Levenshtein distance and other commonly used clustering approaches on simulated and biological third-generation sequenced datasets.
Our clustering approach is appropriate for datasets of unknown cluster centroids, such as those generated with unique molecular identifiers as well as known centroids such as barcoded datasets. A strength of our approach is high accuracy in resolving small clusters and mitigating the number of singletons.
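
For orientation, the classic Levenshtein distance with separate weights for insertions, deletions and substitutions can be sketched as below. This is the textbook dynamic-programming form only, not the 3GOLD algorithm: the frameshift allowance, accommodation caps and speed-ups described in the abstract are not reproduced, and the function and parameter names are assumptions made for this example.

```python
def weighted_levenshtein(a, b, ins_cost=1.0, del_cost=1.0, sub_cost=1.0):
    """Minimum total cost of edits that turn sequence a into sequence b.

    Classic dynamic programming over two rows; each edit type has its
    own weight, so error types can be penalised differently.
    """
    # Row for the empty prefix of a: build b[:j] using j insertions.
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        # First column: delete the first i characters of a.
        curr = [i * del_cost]
        for j, cb in enumerate(b, start=1):
            substitute = prev[j - 1] + (0.0 if ca == cb else sub_cost)
            delete = prev[j] + del_cost      # drop ca from a
            insert = curr[j - 1] + ins_cost  # add cb to a
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


# With unit weights this reduces to the ordinary edit distance.
print(weighted_levenshtein("ACGT", "AGT"))
# Cheaper insertions, e.g. to tolerate a platform's typical indel errors.
print(weighted_levenshtein("ACT", "ACGT", ins_cost=0.5))
```

Lowering the weight of the error type a platform produces most often makes reads that differ only by such errors look closer, which is the general motivation for weighting in a clustering context.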


Subjects
Algorithms, High-Throughput Nucleotide Sequencing, Cluster Analysis, DNA Sequence Analysis
12.
Hum Mutat ; 43(2): 189-199, 2022 02.
Article in English | MEDLINE | ID: mdl-34859533

ABSTRACT

Synpolydactyly 1, also called syndactyly type II (SDTY2), is a genetic limb malformation characterized by polydactyly with syndactyly involving the webbing of the third and fourth fingers, and the fourth and fifth toes. It is caused by heterozygous alterations in HOXD13 with incomplete penetrance and phenotypic variability. In our study, a five-generation family with an SPD phenotype was enrolled in our Rare Disease Genomics Protocol. A comprehensive examination of three generations using Illumina short-read whole-genome sequencing (WGS) did not identify any causative variants. Subsequent WGS using Pacific Biosciences (PacBio) long-read HiFi Circular Consensus Sequencing (CCS) revealed a heterozygous 27-bp duplication in the polyalanine tract of HOXD13. Sanger sequencing of all available family members confirmed that the variant segregates with affected individuals. Reanalysis of an unrelated family with a similar SPD phenotype uncovered a 21-bp (7-alanine) duplication in the same region of HOXD13. Although ExpansionHunter identified these events in most individuals in a retrospective analysis, low sequence coverage due to high GC content in the HOXD13 polyalanine tract makes detection of these events challenging. Our findings highlight the value of long-read WGS in elucidating the molecular etiology of congenital limb malformation disorders.


Subjects
Homeodomain Proteins, Syndactyly, Transcription Factors, Homeodomain Proteins/genetics, Humans, Pedigree, Retrospective Studies, Syndactyly/genetics, Transcription Factors/genetics, Whole Genome Sequencing
13.
Brief Bioinform ; 21(6): 1971-1986, 2020 12 01.
Article in English | MEDLINE | ID: mdl-31792498

ABSTRACT

Over the last 5 years, a number of studies have reported the successful application of single-molecule sequencing technologies to the determination of the size and sequence of pathological expanded microsatellite repeats. However, different custom bioinformatics pipelines were employed in each study, preventing meaningful comparisons and somewhat limiting the reproducibility of the results. In this review, we provide a brief summary of state-of-the-art methods for the characterization of expanded repeat alleles, along with a detailed comparison of bioinformatics tools for the determination of repeat length and sequence, using both real and simulated data. Our reanalysis of publicly available human genome sequencing data suggests a modest, but statistically significant, increase of the error rate of single-molecule sequencing technologies at genomic regions containing short tandem repeats. However, we observe that all the methods herein tested, irrespective of the strategy used for the analysis of the data (either based on the alignment or assembly of the reads), show high levels of sensitivity in both the detection of expanded tandem repeats and the estimation of the expansion size, suggesting that approaches based on single-molecule sequencing technologies are highly effective for the detection and quantification of tandem repeat expansions and contractions.


Subjects
Computational Biology, High-Throughput Nucleotide Sequencing, Microsatellite Repeats, Molecular Sequence Data, DNA Sequence Analysis, Alleles, Chromosome Mapping, Human Genome, Genomics/methods, High-Throughput Nucleotide Sequencing/methods, Humans, Reproducibility of Results, DNA Sequence Analysis/methods
14.
Phytopathology ; 112(4): 973-975, 2022 Apr.
Article in English | MEDLINE | ID: mdl-34645321

ABSTRACT

Elsinoë batatas is a phytopathogenic fungus causing stem and foliage scab disease of sweet potato. At present, there is no reference genome available for E. batatas, limiting basic research for the pathogen. The present study applied the Nanopore single-molecule sequencing technology to sequence the E. batatas genome. This study reports the first high-quality genome sequence of E. batatas, with a total contig size of 26.49 Mb, 50.8% GC content, and an N50 of 2,546,814 bp. The sequences obtained serve as a reference for analysis of E. batatas isolates and provide a resource to better understand the biology of stem and foliage scab disease of sweet potato.
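
The N50 quoted in this abstract is a standard contiguity statistic: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the assembly size. A minimal sketch follows; the function name and the example lengths are hypothetical and are not values from the E. batatas assembly.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches at least
    half of the summed length of all contigs.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical toy assembly of eight contigs (total length 32).
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

A larger N50 relative to total assembly size indicates that most of the sequence sits in long contigs.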


Subjects
Ascomycota, Ipomoea batatas, Ascomycota/genetics, Ipomoea batatas/genetics, Plant Diseases/microbiology
15.
BMC Biol ; 19(1): 108, 2021 05 20.
Article in English | MEDLINE | ID: mdl-34016118

ABSTRACT

BACKGROUND: The majority of the human genome is transcribed in the form of long non-coding (lnc) RNAs. While these transcripts have attracted considerable interest, their molecular mechanisms of function and biological significance remain controversial. One of the main reasons behind this lies in the significant challenges posed by lncRNAs requiring the development of novel methods and concepts to unravel their functionality. Existing methods often lack cross-validation and independent confirmation by different methodologies and therefore leave significant ambiguity as to the authenticity of the outcomes. Nonetheless, despite all the caveats, it appears that lncRNAs may function, at least in part, by regulating other genes via chromatin interactions. Therefore, the function of a lncRNA could be inferred from the function of genes it regulates. In this work, we present a genome-wide functional annotation strategy for lncRNAs based on identification of their regulatory networks via the integration of three distinct types of approaches: co-expression analysis, mapping of lncRNA-chromatin interactions, and assaying molecular effects of lncRNA knockdowns obtained using an inducible and highly specific CRISPR/Cas13 system. RESULTS: We applied the strategy to annotate 407 very long intergenic non-coding (vlinc) RNAs belonging to a novel widespread subclass of lncRNAs. We show that vlincRNAs indeed appear to regulate multiple genes encoding proteins predominantly involved in RNA- and development-related functions, cell cycle, and cellular adhesion via a mechanism involving proximity between vlincRNAs and their targets in the nucleus. A typical vlincRNA can be both a positive and negative regulator and regulate multiple genes both in trans and cis. Finally, we show vlincRNAs and their regulatory networks potentially represent novel components of DNA damage response and are functionally important for the ability of cancer cells to survive genotoxic stress.
CONCLUSIONS: This study provides strong evidence for the regulatory role of the vlincRNA class of lncRNAs and a potentially important role played by these transcripts in the hidden layer of RNA-based regulation in complex biological systems.


Subjects
Long Noncoding RNA/genetics, Cell Nucleus, Chromatin/genetics, Humans
16.
J Proteome Res ; 20(6): 3395-3399, 2021 06 04.
Article in English | MEDLINE | ID: mdl-33904308

ABSTRACT

While mass spectrometry still dominates proteomics research, alternative and potentially disruptive, next-generation technologies are receiving increased investment and attention. Most of these technologies aim at the sequencing of single peptide or protein molecules, typically labeling or otherwise distinguishing a subset of the proteinogenic amino acids. This note considers some theoretical aspects of these future technologies from a bottom-up proteomics viewpoint, including the ability to uniquely identify human proteins as a function of which and how many amino acids can be read, enzymatic efficiency, and the maximum read length. This is done through simulations under ideal and non-ideal conditions to set benchmarks for what may be achievable with future single-molecule sequencing technology. The simulations reveal, among other observations, that the best choice of reading N amino acids performs similarly to the average choice of N+1 amino acids, and that the discrimination power of the amino acids scales with their frequency in the proteome. The simulations are agnostic with respect to the next-generation proteomics platform, and the results and conclusions should therefore be applicable to any single-molecule partial peptide sequencing technology.
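
The kind of question these simulations ask can be illustrated with a toy calculation: if only a subset of amino acids can be read, how many sequences still have a unique signature? The sketch below is a deliberate simplification and not the authors' simulation. It treats whole sequences rather than digested peptides, assumes error-free reads and known positions for unread residues, and uses made-up sequences and a hypothetical function name.

```python
from collections import Counter


def fraction_uniquely_identified(proteins, readable):
    """Fraction of sequences whose partial-read signature is unique.

    proteins: dict mapping a name to an amino acid sequence.
    readable: set of one-letter amino acid codes that can be read;
              every other residue is masked as 'x' (position retained).
    """
    def signature(seq):
        return "".join(aa if aa in readable else "x" for aa in seq)

    signatures = {name: signature(seq) for name, seq in proteins.items()}
    counts = Counter(signatures.values())
    unique = sum(1 for sig in signatures.values() if counts[sig] == 1)
    return unique / len(signatures)


# Made-up sequences, for illustration only.
toy = {"p1": "ACKD", "p2": "ACRD", "p3": "GCKD"}
for readable in ({"C"}, {"C", "K"}, {"A", "C", "K"}):
    print(sorted(readable), fraction_uniquely_identified(toy, readable))
```

Reading more amino acid types can only split signatures further, so the uniquely identified fraction in this toy model never decreases as the readable set grows.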


Subjects
Proteome, Proteomics, Amino Acid Sequence, Humans, Mass Spectrometry, Peptides
17.
BMC Genomics ; 22(1): 513, 2021 Jul 07.
Article in English | MEDLINE | ID: mdl-34233619

ABSTRACT

BACKGROUND: Direct-sequencing technologies, such as Oxford Nanopore's, are delivering long RNA reads with great efficacy and convenience. These technologies afford an ability to detect post-transcriptional modifications at a single-molecule resolution, promising new insights into the functional roles of RNA. However, realizing this potential requires new tools to analyze and explore this type of data. RESULT: Here, we present Sequoia, a visual analytics tool that allows users to interactively explore nanopore sequences. Sequoia combines a Python-based backend with a multi-view visualization interface, enabling users to import raw nanopore sequencing data in a Fast5 format, cluster sequences based on electric-current similarities, and drill down into signals to identify properties of interest. We demonstrate the application of Sequoia by generating and analyzing ~500k reads from direct RNA sequencing data of a human HeLa cell line. We focus on comparing signal features from m6A and m5C RNA modifications as the first step towards building automated classifiers. We show how, through iterative visual exploration and tuning of dimensionality reduction parameters, we can separate modified RNA sequences from their unmodified counterparts. We also document new, qualitative signal signatures that distinguish these modifications from otherwise normal RNA bases, which we were able to discover from the visualization. CONCLUSIONS: Sequoia's interactive features complement existing computational approaches in nanopore-based RNA workflows. The insights gleaned through visual analysis should help users in developing rationales, hypotheses, and insights into the dynamic nature of RNA. Sequoia is available at https://github.com/dnonatar/Sequoia.


Subjects
Nanopore Sequencing; Nanopores; Sequoia; HeLa Cells; High-Throughput Nucleotide Sequencing; Humans; Sequence Analysis, DNA; Software
18.
BMC Genomics ; 22(1): 643, 2021 Sep 06.
Article in English | MEDLINE | ID: mdl-34488624

ABSTRACT

BACKGROUND: The lower Dipteran fungus fly, Sciara coprophila, has many unique biological features that challenge the rule of genome DNA constancy. For example, Sciara undergoes paternal chromosome elimination and maternal X chromosome nondisjunction during spermatogenesis, paternal X elimination during embryogenesis, intrachromosomal DNA amplification of DNA puff loci during larval development, and germline-limited chromosome elimination from all somatic cells. Paternal chromosome elimination in Sciara was the first observation of imprinting, though the mechanism remains a mystery. Here, we present the first draft genome sequence for Sciara coprophila to take a large step forward in addressing these features. RESULTS: We assembled the Sciara genome using PacBio, Nanopore, and Illumina sequencing. To find an optimal assembly using these datasets, we generated 44 short-read and 50 long-read assemblies. We ranked assemblies using 27 metrics assessing contiguity, gene content, and dataset concordance. The highest-ranking assemblies were scaffolded using BioNano optical maps. RNA-seq datasets from multiple life stages and both sexes facilitated genome annotation. A set of 66 metrics was used to select the first draft assembly for Sciara. Nearly half of the Sciara genome sequence was anchored into chromosomes, and all scaffolds were classified as X-linked or autosomal by coverage. CONCLUSIONS: We determined that X-linked genes in Sciara males undergo dosage compensation. An entire bacterial genome from the Rickettsia genus, a group known to be endosymbionts in insects, was co-assembled with the Sciara genome, opening the possibility that Rickettsia may function in sex determination in Sciara. Finally, the signal-level PacBio and Nanopore data support the presence of cytosine and adenine modifications in the Sciara genome, consistent with a possible role in imprinting.


Subjects
High-Throughput Nucleotide Sequencing; X Chromosome; DNA; Female; Fungi; Humans; Male; Sequence Analysis, DNA
19.
Brief Bioinform ; 20(3): 866-876, 2019 05 21.
Article in English | MEDLINE | ID: mdl-29112696

ABSTRACT

Long reads obtained from third-generation sequencing platforms can help overcome the long-standing challenge of the de novo assembly of sequences for the genomic analysis of non-model eukaryotic organisms. Numerous long-read-aided de novo assemblies have been published recently, which exhibited superior quality of the assembled genomes in comparison with those achieved using earlier second-generation sequencing technologies. Evaluating assemblies is important in guiding the appropriate choice for specific research needs. In this study, we evaluated 10 long-read assemblers using a variety of metrics on Pacific Biosciences (PacBio) data sets from different taxonomic categories with considerable differences in genome size. The results allowed us to narrow down the list to a few assemblers that can be effectively applied to eukaryotic assembly projects. Moreover, we highlight how best to use limited genomic resources for effectively evaluating the genome assemblies of non-model organisms.
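The abstract does not list the metrics it used. As an assumed illustration of a common contiguity metric for comparing assemblies, the sketch below computes N50: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the total assembly length. The contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Two hypothetical assemblies with the same total length (300).
contiguous = [100, 80, 50, 30, 20, 20]
fragmented = [50, 50, 50, 50, 50, 50]

print(n50(contiguous))
print(n50(fragmented))
```

A higher N50 for the same total length indicates a more contiguous assembly, although contiguity alone says nothing about correctness, which is why evaluations like the one above combine several metrics.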


Subjects
Genome; Sequence Analysis, DNA/methods; Animals; Caenorhabditis elegans/genetics; Escherichia coli/genetics; Ipomoea/genetics; Plasmodium falciparum/genetics
20.
Brief Bioinform ; 19(1): 23-40, 2018 01 01.
Article in English | MEDLINE | ID: mdl-27742661

ABSTRACT

With the advent of next-generation sequencing (NGS) technology, various de novo assembly algorithms based on the de Bruijn graph have been developed to construct chromosome-level sequences. However, numerous technical or computational challenges in de novo assembly remain, although many bright ideas and heuristics have been suggested to tackle the challenges in both experimental and computational settings. In this review, we categorize de novo assemblers on the basis of the type of de Bruijn graphs (Hamiltonian and Eulerian) and discuss the challenges of de novo assembly for short NGS reads regarding computational complexity and assembly ambiguity. Then, we discuss how the limitations of the short reads can be overcome by using a single-molecule sequencing platform that generates long reads of up to several kilobases. In fact, long-read assembly has caused a paradigm shift in whole-genome assembly in terms of algorithms and supporting steps. We also summarize (i) hybrid assemblies using both short and long reads and (ii) overlap-based assemblies for long reads and discuss their challenges and future prospects. This review provides guidelines to determine the optimal approach for a given input data type, computational budget or genome.
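The Eulerian formulation mentioned in this abstract can be sketched as follows. This is a minimal illustration, not any particular assembler's method: it assumes error-free reads, full coverage, and a graph that contains an Eulerian path. The function names and the example reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, one edge per distinct k-mer."""
    kmers = {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes the graph has an Eulerian path."""
    in_degree = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            in_degree[target] += 1
    nodes = set(graph) | set(in_degree)
    # Start where out-degree exceeds in-degree by one, if such a node exists.
    start = next(
        (n for n in sorted(nodes) if len(graph.get(n, [])) - in_degree[n] == 1),
        min(nodes),
    )
    remaining = {node: list(targets) for node, targets in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def spell(path):
    """Concatenate a node path into a sequence by overlapping consecutive nodes."""
    return path[0] + "".join(node[-1] for node in path[1:])


# Hypothetical error-free reads covering one short sequence.
reads = ["ATGGC", "GGCGT", "CGTGC", "TGCA"]
print(spell(eulerian_path(de_bruijn(reads, 4))))
```

With repeats longer than k-1, several Eulerian paths can exist, which is one source of the assembly ambiguity the review discusses for short reads.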


Subjects
Genome, Human; High-Throughput Nucleotide Sequencing/methods; Whole Genome Sequencing/methods; Algorithms; Genomics; Humans; Software