1.
Mol Cell ; 82(1): 209-217.e7, 2022 01 06.
Article in English | MEDLINE | ID: mdl-34951964

ABSTRACT

Extrachromosomal circular DNA (eccDNA) is common in somatic tissue, but its existence and effects in the human germline are unexplored. We used microscopy, long-read DNA sequencing, and new analytic methods to document thousands of eccDNAs from human sperm. EccDNAs derived from all genomic regions and mostly contained a single DNA fragment, although some consisted of multiple fragments. The generation of eccDNA inversely correlates with the meiotic recombination rate, and chromosomes with high coding-gene density and Alu element abundance form the least eccDNA. Analysis of insertions in human genomes further indicates that eccDNA can persist in the human germline when the circular molecules reinsert themselves into the chromosomes. Our results suggest that eccDNA has transient and permanent effects on the germline. They explain how differences in the physical and genetic map might arise and offer an explanation of how Alu elements coevolved with genes to protect genome integrity against deleterious mutations producing eccDNA.


Subjects
Human Chromosomes , Circular DNA/metabolism , Meiosis , Genetic Recombination , Spermatozoa/metabolism , Alu Elements , Circular DNA/genetics , Molecular Evolution , Developmental Gene Expression Regulation , Humans , Male , Mutation
2.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39222062

ABSTRACT

Accurate taxonomic profiling of microbial taxa in a metagenomic sample is vital to gain insights into microbial ecology. Recent advancements in sequencing technologies have contributed tremendously toward understanding these microbes at species resolution through a whole shotgun metagenomic approach. In this study, we developed a new bioinformatics tool, coverage-based analysis for identification of microbiome (CAIM), for accurate taxonomic classification and quantification within both long- and short-read metagenomic samples using an alignment-based method. CAIM depends on two different containment techniques to identify species in metagenomic samples, using their genome coverage information to filter out false positives rather than the traditional approach of relative abundance. In addition, we propose a nucleotide-count-based abundance estimation, which yields a lower root mean square error than the traditional read-count approach. We evaluated the performance of CAIM on 28 metagenomic mock communities and 2 synthetic datasets by comparing it with other top-performing tools. CAIM performed consistently well across datasets, and better than the other tools, in identifying microbial taxa and in estimating relative abundances. CAIM was then applied to a real dataset sequenced on both Nanopore (with and without amplification) and Illumina sequencing platforms, and the taxonomic profiles were highly similar between the sequencing platforms. Lastly, CAIM was applied to fecal shotgun metagenomic datasets of 232 colorectal cancer patients and 229 controls obtained from 4 different countries and of 44 primary liver cancer patients and 76 controls. Models using the genome-coverage cutoff performed better than those using relative-abundance cutoffs in discriminating colorectal cancer and primary liver cancer patients from healthy controls with high-confidence species markers.


Subjects
Metagenomics , Microbiota , Humans , Microbiota/genetics , Metagenomics/methods , Computational Biology/methods , Metagenome , High-Throughput Nucleotide Sequencing/methods , Software , Algorithms , DNA Sequence Analysis/methods
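
The two ideas named in the CAIM abstract above (filtering candidate species by genome coverage, and estimating abundance from aligned nucleotides rather than read counts) can be illustrated with a minimal sketch. This is not CAIM's implementation: the `Alignment` record, the function names, and the `min_coverage=0.5` cutoff are all hypothetical, and the sketch omits the containment step and any genome-length normalization.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """One read aligned to one reference genome (hypothetical record type)."""
    genome: str
    start: int  # 0-based inclusive start on the reference
    end: int    # 0-based exclusive end on the reference


def genome_coverage(alignments, genome_length):
    """Fraction of reference positions covered by at least one alignment."""
    covered = 0
    last_end = 0
    for start, end in sorted((a.start, a.end) for a in alignments):
        start = max(start, last_end)  # skip bases already counted
        if end > start:
            covered += end - start
            last_end = end
    return covered / genome_length


def relative_abundance(alignments, genome_lengths, min_coverage=0.5, by="nucleotides"):
    """Relative abundance per genome after a coverage-based filter.

    by="reads" counts each alignment once; by="nucleotides" counts aligned bases.
    The min_coverage value is an illustrative assumption, not a published cutoff.
    """
    per_genome = {}
    for a in alignments:
        per_genome.setdefault(a.genome, []).append(a)

    totals = {}
    for genome, alns in per_genome.items():
        if genome_coverage(alns, genome_lengths[genome]) < min_coverage:
            continue  # low coverage: treated here as a likely false positive
        if by == "reads":
            totals[genome] = len(alns)
        else:
            totals[genome] = sum(a.end - a.start for a in alns)

    grand_total = sum(totals.values())
    return {g: v / grand_total for g, v in totals.items()} if grand_total else {}
```

With one long read on genome A and four short reads spanning the same number of bases on genome B, the read-count estimate favors B while the nucleotide-count estimate treats the two equally, which is why the choice matters when long and short reads are mixed.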
3.
Nucleic Acids Res ; 51(9): 4148-4177, 2023 05 22.
Article in English | MEDLINE | ID: mdl-37094040

ABSTRACT

DNA sequence composition determines the topology and stability of G-quadruplexes (G4s). Bulged G-quadruplex structures (G4-Bs) are a subset of G4s characterized by 3D conformations with bulges. Current search algorithms fail to capture stable G4-Bs, making their genome-wide study infeasible. Here, we introduced a large family of computationally defined and experimentally verified potential G4-B forming sequences (pG4-BS). We found 478 263 pG4-BS regions that do not overlap 'canonical' G4-forming sequences in the human genome and are preferentially localized in transcription regulatory regions including R-loops and open chromatin. Over 90% of protein-coding genes contain pG4-BS in their promoter or gene body. We observed generally higher pG4-BS content in R-loops and their flanks, and in longer genes that are associated with brain tissue, immune and developmental processes. Also, the presence of pG4-BS on both template and non-template strands in promoters is associated with oncogenesis, cardiovascular disease and stemness. Our G4-BS models predicted G4-forming ability in vitro with 91.5% accuracy. Analysis of G4-seq and CUT&Tag data strongly supports the existence of G4-BS conformations genome-wide. We reconstructed a novel G4-B 3D structure located in the E2F8 promoter. This study defines a large family of G4-like sequences, offering new insights into the essential biological functions and potential future therapeutic uses of G4-B.


Subjects
G-Quadruplexes , Humans , Human Genome/genetics , Genome-Wide Association Study , Genetic Promoter Regions , Base Sequence
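
To make the distinction between 'canonical' and bulged G4-forming sequences in the abstract above concrete, here is a minimal pattern-matching sketch. The canonical pattern is the widely used motif of four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The bulged pattern is only an illustrative assumption (a G-tract interrupted by a single non-G base); it is not the pG4-BS model family defined in the paper. The sketch scans only the G-rich strand as given.

```python
import re

# Canonical motif: four runs of >=3 guanines separated by loops of 1-7 nt.
CANONICAL_G4 = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")

# Hypothetical bulged G-tract: either an intact run of >=3 guanines, or three
# guanines interrupted by one non-G base (GGNG or GNGG). Assumed for
# illustration only.
TRACT = r"(?:G{3,}|GG[ACT]G|G[ACT]GG)"
BULGED_G4 = re.compile(rf"{TRACT}(?:[ACGT]{{1,7}}{TRACT}){{3}}")


def classify(sequence):
    """Return 'canonical', 'bulged', or 'none' for a DNA sequence."""
    seq = sequence.upper()
    if CANONICAL_G4.search(seq):
        return "canonical"
    if BULGED_G4.search(seq):
        return "bulged"
    return "none"
```

For example, `classify("GGGTTAGGGTTAGGGTTAGGG")` returns `"canonical"`, while replacing the first tract with `GGAG` leaves only three intact runs, so the canonical search fails and the sequence is reported as `"bulged"`.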
4.
Article in English | MEDLINE | ID: mdl-39082161

ABSTRACT

BACKGROUND: Methicillin-resistant Staphylococcus haemolyticus (MRSH) is an important pathogenic agent of bovine mastitis. Among the prominent clonal lineages in dairy cows are MRSH sequence types ST3 and ST42. Little information is available on the complete characterization of SCCmec elements in MRSH. OBJECTIVE: In this study, two clinical isolates of MRSH ST3 and ST42 from bovine mastitis milk were selected, and their nontypable SCCmec structures were compared. METHODS: Two MRSH strains, MRSH-ST3 strain M62.3 and MRSH-ST42 strain M81.1, were identified from bovine mastitis milk in Thailand in 2022. Minimum inhibitory concentrations were used to screen for antimicrobial susceptibility. Oxford Nanopore Technologies and Illumina sequencing were performed in combination to complete the genome. The gene organization and structure of their SCCmec types were analysed and compared with the whole sequences of other strains in the same sequence types. RESULTS: Both MRSH-ST3 strain M62.3 and MRSH-ST42 strain M81.1 possessed the class C1 mec complex but lacked the ccr gene complex. Notably, MRSH-ST42 strain M81.1 contained a novel variant of the C1 mec complex, which consisted of IS431-mecA-ISSha1-paaZ-upgQ-IS431, with IS431 organized in the same orientation. Apart from class C1 mec and the heavy metal resistance cluster, the gene composition and order of the SCCmec element varied. In ST3, variations in the SCCmec type, gene content and organization were observed. CONCLUSIONS: The distinct evolution of the MRSH lineage was indicated by the various SCCmec elements. The insertion of ISSha1 resulted in a unique variant of the class C1 mec complex that demonstrated the important role of the insertion sequence in SCCmec diversification.

5.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36198068

ABSTRACT

Extrachromosomal circular DNA (eccDNA) of chromosomal origin is found in many eukaryotic species and cell types, including cancer, where eccDNAs with oncogenes drive tumorigenesis. Most studies of eccDNA employ short-read sequencing for their identification. However, short-read sequencing cannot resolve the complexity of genomic repeats, which can lead to missing eccDNA products. Long-read sequencing technologies provide an alternative for constructing complete eccDNA maps. We present a software suite, Construction-based Rolling-circle-amplification for eccDNA Sequence Identification and Location (CReSIL), to identify and characterize eccDNA from long-read sequences. CReSIL's performance in identifying eccDNA, with a minimum F1 score of 0.98, is superior to that of the other bioinformatic tools based on simulated data. CReSIL provides many useful features for genomic annotation, which can be used to infer eccDNA function, and Circos visualization for eccDNA architecture investigation. We demonstrated CReSIL's capability in several long-read sequencing datasets, including datasets enriched for eccDNA and whole genome datasets from cells containing large eccDNA products. In conclusion, the CReSIL software suite is a versatile tool for investigating complex and simple eccDNA in eukaryotic cells.


Subjects
Circular DNA , Genome , Circular DNA/genetics , DNA/genetics , Eukaryotic Cells
6.
Curr Microbiol ; 81(8): 221, 2024 Jun 14.
Article in English | MEDLINE | ID: mdl-38874629

ABSTRACT

Schaalia turicensis is a facultatively anaerobic Gram-positive bacillus that commonly inhabits the oropharynx and the gastrointestinal and genitourinary tracts of healthy individuals. This organism was co-isolated with Neisseria gonorrhoeae from a 15-year-old Thai male patient with gonococcal urethritis in Bangkok, Thailand. In this study, we characterized the class 1 integron in the S. turicensis isolate using whole-genome sequencing and bioinformatics analysis. Sequencing analysis confirmed the presence of an imperfect class 1 integron located on the chromosome and a novel 24.5-kb-long composite transposon, named Tn7083. The transposon Tn7083 carried genes encoding chloramphenicol resistance (cmx), sulfonamide resistance (sul1), and aminoglycoside resistance [aph(6)-Id (strB), aph(3'')-Ib (strA), aph(3')-Ia].


Subjects
Anti-Bacterial Agents , Bacterial Genome , Gonorrhea , Urethritis , Humans , Male , Thailand , Urethritis/microbiology , Gonorrhea/microbiology , Anti-Bacterial Agents/pharmacology , Adolescent , Whole Genome Sequencing , Microbial Sensitivity Tests , Neisseria gonorrhoeae/genetics , Neisseria gonorrhoeae/isolation & purification , Neisseria gonorrhoeae/classification , Neisseria gonorrhoeae/drug effects , DNA Transposable Elements/genetics , Bacterial Drug Resistance/genetics
7.
BMC Genomics ; 24(1): 405, 2023 Jul 19.
Article in English | MEDLINE | ID: mdl-37468842

ABSTRACT

BACKGROUND: Preterm labor syndrome is associated with high perinatal morbidity and mortality, and intra-amniotic infection is a cause of preterm labor. The standard identification of causative microorganisms is based on the use of biochemical phenotypes, together with broth dilution-based antibiotic susceptibility testing of organisms grown in culture. However, such methods cannot provide accurate epidemiological information or a genetic basis of antimicrobial resistance, which can lead to inappropriate antibiotic administration. Hybrid genome assembly combines short- and long-read sequencing, which provides better genomic resolution and completeness for genotypic identification and characterization. Herein, we performed hybrid whole-genome sequencing and assembly of a pathogen associated with acute histologic chorioamnionitis in women presenting with PPROM. RESULTS: We identified Enterococcus faecium, namely E. faecium strain RAOG174, with several antibiotic resistance genes, including those for vancomycin and aminoglycoside resistance. Virulence-associated genes and a potential bacteriophage were also identified in this genome. CONCLUSION: We report herein the first study demonstrating the use of hybrid genome assembly and genomic analysis to identify E. faecium ST17 as a pathogen associated with acute histologic chorioamnionitis. The analysis provided several antibiotic resistance-associated genes/mutations and mobile genetic elements. The occurrence of E. faecium ST17 raises awareness of colonization by clinically relevant E. faecium and its carriage of antibiotic resistance. This finding highlights the advantages of a genomic approach in identifying the bacterial species and antibiotic resistance genes of E. faecium, supporting appropriate antibiotic use to improve maternal and neonatal care.


Subjects
Chorioamnionitis , Enterococcus faecium , Gram-Positive Bacterial Infections , Premature Obstetric Labor , Pregnancy , Humans , Female , Anti-Bacterial Agents/pharmacology , Anti-Bacterial Agents/therapeutic use , Chorioamnionitis/genetics , Chorioamnionitis/drug therapy , Enterococcus faecium/genetics , Genomics , Premature Obstetric Labor/drug therapy , Microbial Drug Resistance , Gram-Positive Bacterial Infections/microbiology
8.
BMC Plant Biol ; 23(1): 59, 2023 Jan 28.
Article in English | MEDLINE | ID: mdl-36707785

ABSTRACT

BACKGROUND: Massive parallel sequencing technologies have enabled the elucidation of plant phylogenetic relationships from chloroplast genomes at a high pace. These include members of the family Rhamnaceae. The current Rhamnaceae phylogenetic tree is from 13 out of 24 Rhamnaceae chloroplast genomes, and only one chloroplast genome of the genus Ventilago is available. Hence, the phylogenetic relationships in Rhamnaceae remain incomplete, and more representative species are needed. RESULTS: The complete chloroplast genome of Ventilago harmandiana Pierre was outlined using a hybrid assembly of long- and short-read technologies. The accuracy and validity of the final genome were confirmed with PCR amplifications and investigation of coverage depth. Sanger sequencing was used to correct for differences in lengths and nucleotide bases between inverted repeats because of the homopolymers. The phylogenetic trees reconstructed using prevalent methods for phylogenetic inference were topologically similar. The clustering based on codon usage was congruent with the molecular phylogenetic tree. The groups of genera in each tribe were in accordance with tribal classification based on molecular markers. We resolved the phylogenetic relationships among six Hovenia species, three Rhamnus species, and two Ventilago species. Our reconstructed tree provides the most complete and reliable low-level taxonomy to date for the family Rhamnaceae. Similar to other higher plants, the RNA editing mostly resulted in converting serine to leucine. Besides, most genes were subjected to purifying selection. Annotation anomalies, including indel calling errors, unaligned open reading frames of the same gene, inconsistent prediction of intergenic regions, and misannotated genes, were identified in the published chloroplast genomes used in this study. These could be a result of the usual imperfections in computational tools, and/or existing errors in reference genomes. Importantly, these are points of concern with regards to utilizing published chloroplast genomes for comparative genomic analysis. CONCLUSIONS: In summary, we successfully demonstrated the use of comprehensive genomic data, including DNA and amino acid sequences, to build a reliable and high-resolution phylogenetic tree for the family Rhamnaceae. Additionally, our study indicates that the revision of genome annotation before comparative genomic analyses is necessary to prevent the propagation of errors and complications in downstream analysis and interpretation.


Subjects
Chloroplast Genome , Rhamnaceae , Chloroplast Genome/genetics , Rhamnaceae/genetics , Phylogeny , Genomics/methods , Chloroplasts/genetics
9.
Nucleic Acids Res ; 49(2): e7, 2021 01 25.
Article in English | MEDLINE | ID: mdl-32710622

ABSTRACT

Traditional epitranscriptomics relies on capturing a single RNA modification by antibody or chemical treatment, combined with short-read sequencing to identify its transcriptomic location. This approach is labor-intensive and may introduce experimental artifacts. Direct sequencing of native RNA using Oxford Nanopore Technologies (ONT) can allow for directly detecting the RNA base modifications, although these modifications might appear as sequencing errors. The percent Error of Specific Bases (%ESB) was higher for native RNA than unmodified RNA, which enabled the detection of ribonucleotide modification sites. Based on the %ESB differences, we developed a bioinformatic tool, epitranscriptional landscape inferring from glitches of ONT signals (ELIGOS), that is based on various types of synthetic modified RNA and applied to rRNA and mRNA. ELIGOS is able to accurately predict known classes of RNA methylation sites (AUC > 0.93) in rRNAs from Escherichia coli, yeast, and human cells, using either unmodified in vitro transcription RNA or a background error model, which mimics the systematic error of direct RNA sequencing as the reference. The well-known DRACH/RRACH motif was localized and identified, consistent with previous studies, using differential analysis of ELIGOS to study the impact of RNA m6A methyltransferase by comparing wild type and knockouts in yeast and mouse cells. Lastly, the DRACH motif could also be identified in the mRNA of three human cell lines. The mRNA modification identified by ELIGOS is at the level of individual base resolution. In summary, we have developed a bioinformatic software package to uncover native RNA modifications.


Subjects
Computational Biology/methods , High-Throughput Nucleotide Sequencing , Post-Transcriptional RNA Processing , RNA-Seq , Scientific Experimental Error , Software , Adenine/analogs & derivatives , Adenine/analysis , Animals , Cell Line , Escherichia coli/genetics , Humans , Meiosis , Methyltransferases/deficiency , Methyltransferases/metabolism , Mice , Knockout Mice , Nucleotide Motifs , Bacterial RNA/genetics , Fungal RNA/genetics , Messenger RNA/genetics , Ribosomal RNA/genetics , ROC Curve , Saccharomyces cerevisiae/genetics , DNA Sequence Analysis , Genetic Templates , Genetic Transcription
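
The core idea in the ELIGOS abstract above, comparing the percent Error of Specific Bases (%ESB) between native RNA and an unmodified reference, can be sketched as follows. This is a simplified illustration and not the ELIGOS software: it assumes %ESB at a site is the share of covering reads with a base-calling error there, and it flags sites with a plain difference threshold where the published tool relies on its own statistical procedure. The function names and the `min_depth` and `min_delta` defaults are hypothetical.

```python
def percent_esb(error_count, total_count):
    """Percent of reads covering a site that show an error there (assumed definition)."""
    if total_count == 0:
        return 0.0
    return 100.0 * error_count / total_count


def candidate_sites(native, control, min_depth=20, min_delta=10.0):
    """Return (position, delta) pairs where native %ESB exceeds control %ESB.

    native and control map position -> (error_count, total_count).
    min_depth and min_delta are illustrative thresholds, not published values.
    """
    hits = []
    for pos in sorted(native.keys() & control.keys()):
        n_err, n_tot = native[pos]
        c_err, c_tot = control[pos]
        if min(n_tot, c_tot) < min_depth:
            continue  # too few reads on either side to compare
        delta = percent_esb(n_err, n_tot) - percent_esb(c_err, c_tot)
        if delta >= min_delta:
            hits.append((pos, delta))
    return hits
```

The control may be reads from unmodified in vitro transcribed RNA or counts drawn from a background error model, matching the two reference options the abstract describes.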
10.
J Perinat Med ; 51(6): 769-774, 2023 Jul 26.
Article in English | MEDLINE | ID: mdl-36503654

ABSTRACT

OBJECTIVES: Early diagnosis and treatment of intra-amniotic infection is crucial. Rapid pathogen identification allows for a definite diagnosis and enables proper management. We determined whether 16S amplicon sequencing performed by a nanopore sequencing technique makes rapid bacterial identification at the species level possible in intra-amniotic infection. METHODS: Five cases of confirmed intra-amniotic infection, determined by either cultivation or 16S rDNA polymerase chain reaction (PCR) Sanger sequencing, and 10 cases of women who underwent mid-trimester genetic amniocentesis were included. DNA was extracted from amniotic fluid and PCR was performed on the full-length 16S rDNA. Nanopore sequencing was performed. The results derived from nanopore sequencing were compared with those derived from cultivation and Sanger sequencing methods. RESULTS: Bacteria were successfully detected from amniotic fluid using nanopore sequencing in all cases of intra-amniotic infection. Nanopore sequencing identified additional bacterial species and polymicrobial infections. All patients who underwent a mid-trimester amniocentesis had negative cultures, negative 16S PCR Sanger sequencing and negative nanopore sequencing. Identification of the microorganisms at the bacterial species level using the nanopore sequencing technique was achieved within 5-9 h from DNA extraction. CONCLUSIONS: This is the first study demonstrating that the nanopore sequencing technique is capable of rapid diagnosis of intra-amniotic infection using fresh amniotic fluid samples.


Subjects
Chorioamnionitis , Nanopore Sequencing , Nanopores , Pregnancy , Humans , Female , Chorioamnionitis/diagnosis , Chorioamnionitis/microbiology , Amniotic Fluid/microbiology , Amniocentesis , Bacteria
11.
Infect Immun ; 89(4)2021 03 17.
Article in English | MEDLINE | ID: mdl-33468580

ABSTRACT

Mutation of purR was previously shown to enhance the virulence of Staphylococcus aureus in a murine sepsis model, and this cannot be fully explained by increased expression of genes within the purine biosynthesis pathway. Rather, the increased production of specific S. aureus virulence factors, including alpha toxin and the fibronectin-binding proteins, was shown to play an important role. Mutation of purR was also shown previously to result in increased abundance of SarA. Here, we demonstrate by transposon sequencing that mutation of purR in the USA300 strain LAC increases fitness in a biofilm while mutation of sarA has the opposite effect. Therefore, we assessed the impact of sarA on reported purR-associated phenotypes by characterizing isogenic purR, sarA, and sarA/purR mutants. The results confirmed that mutation of purR results in increased abundance of alpha toxin, protein A, the fibronectin-binding proteins, and SarA, decreased production of extracellular proteases, an increased capacity to form a biofilm, and increased virulence in an osteomyelitis model. Mutation of sarA had the opposite effects on all of these phenotypes and, other than bacterial burdens in the bone, all of the phenotypes of sarA/purR mutants were comparable to those of sarA mutants. Limiting the production of extracellular proteases reversed all of the phenotypes of sarA mutants and most of those of sarA/purR mutants. We conclude that a critical component defining the virulence of a purR mutant is the enhanced production of SarA, which limits protease production to an extent that promotes the accumulation of critical S. aureus virulence factors.


Subjects
Bacterial Proteins/biosynthesis , Bacterial Proteins/genetics , Endopeptidases/biosynthesis , Mutation , Repressor Proteins/genetics , Staphylococcal Infections/microbiology , Staphylococcus aureus/physiology , Trans-Activators/biosynthesis , Virulence Factors/genetics , Animals , Biofilms/growth & development , DNA Transposable Elements , Disease Susceptibility , Extracellular Space , Bacterial Gene Expression Regulation , Mice , Osteomyelitis/microbiology , Staphylococcus aureus/pathogenicity , Virulence/genetics
12.
Genome Res ; 27(11): 1783-1794, 2017 11.
Article in English | MEDLINE | ID: mdl-29030469

ABSTRACT

The stochastic dynamics and regulatory mechanisms that govern differentiation of individual human neural precursor cells (NPC) into mature neurons are currently not fully understood. Here, we used single-cell RNA-sequencing (scRNA-seq) of developing neurons to dissect/identify NPC subtypes and critical developmental stages of alternative lineage specifications. This study comprises an unsupervised, high-resolution strategy for identifying cell developmental bifurcations, tracking the stochastic transcript kinetics of the subpopulations, elucidating regulatory networks, and finding key regulators. Our data revealed the bifurcation and developmental tracks of the two NPC subpopulations, and we captured an early (24 h) transition phase that leads to alternative neuronal specifications. The consequent up-regulation and down-regulation of stage- and subpopulation-specific gene groups during the course of maturation revealed biological insights with regard to key regulatory transcription factors and lincRNAs that control cellular programs in the identified neuronal subpopulations.


Subjects
Gene Expression Profiling/methods , Gene Regulatory Networks , Neural Stem Cells/cytology , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Cell Differentiation , Cell Lineage , Cultured Cells , Developmental Gene Expression Regulation , Humans , Neurogenesis , Long Noncoding RNA/genetics , Transcription Factors/genetics
13.
Chem Res Toxicol ; 33(12): 2944-2952, 2020 12 21.
Article in English | MEDLINE | ID: mdl-32799528

ABSTRACT

Chemically induced DNA adducts can lead to mutations and cancer. Unfortunately, because common analytical methods (e.g., liquid chromatography-mass spectrometry) require adducts to be digested or liberated from DNA before quantification, information about their positions within the DNA sequence is lost. Advances in nanopore sequencing technologies allow individual DNA molecules to be analyzed at single-nucleobase resolution, enabling us to study the dynamics of epigenetic modifications and exposure-induced DNA adducts in their native forms on the DNA strand. We applied and evaluated the commercially available Oxford Nanopore Technologies (ONT) sequencing platform for site-specific detection of DNA adducts and for distinguishing individual alkylated DNA adducts. Using ONT and the publicly available ELIGOS software, we analyzed a library of 15 plasmids containing site-specifically inserted O6- or N2-alkyl-2'-deoxyguanosine lesions differing in sizes and regiochemistries. Positions of DNA adducts were correctly located, and individual DNA adducts were clearly distinguished from each other.


Subjects
DNA Adducts/analysis , DNA/chemistry , Molecular Structure , Nanopore Sequencing , Particle Size , Plasmids , Stereoisomerism , Surface Properties
14.
Nucleic Acids Res ; 46(15): 7566-7585, 2018 09 06.
Article in English | MEDLINE | ID: mdl-29945198

ABSTRACT

R-loops are three-stranded RNA:DNA hybrid structures essential for many normal and pathobiological processes. Previously, we generated a quantitative R-loop forming sequence (RLFS) model, quantitative model of R-loop-forming sequences (QmRLFS) and predicted ∼660 000 RLFSs; most of them located in genes and gene-flanking regions, G-rich regions and disease-associated genomic loci in the human genome. Here, we conducted a comprehensive comparative analysis of these RLFSs using experimental data and demonstrated the high performance of QmRLFS predictions on the nucleotide and genome scales. The preferential co-localization of RLFS with promoters, U1 splice sites, gene ends, enhancers and non-B DNA structures, such as G-quadruplexes, provides evidence for the mechanical linkage between DNA tertiary structures, transcription initiation and R-loops in critical regulatory genome regions. We introduced and characterized an abundant class of reverse-forward RLFS clusters highly enriched in non-B DNA structures, which localized to promoters, gene ends and enhancers. The RLFS co-localization with promoters and transcriptionally active enhancers suggested new models for in cis and in trans regulation by RNA:DNA hybrids of transcription initiation and formation of 3D-chromatin loops. Overall, this study provides a rationale for the discovery and characterization of the non-B DNA regulatory structures involved in the formation of the RNA:DNA interactome as the basis for an emerging quantitative R-loop biology and pathobiology.


Subjects
Computational Biology/methods , Genetic Enhancer Elements/genetics , G-Quadruplexes , Human Genome/genetics , Genetic Promoter Regions/genetics , DNA/chemistry , DNA/genetics , DNA/metabolism , Neoplastic Gene Expression Regulation , Humans , K562 Cells , Nucleic Acid Conformation , RNA/chemistry , RNA/genetics , RNA/metabolism , Genetic Transcription
15.
Nucleic Acids Res ; 46(7): e38, 2018 04 20.
Article in English | MEDLINE | ID: mdl-29346625

ABSTRACT

Completion of eukaryal genomes can be a difficult task because of the highly repetitive sequences along the chromosomes and the short read lengths of second-generation sequencing. Saccharomyces cerevisiae strain CEN.PK113-7D, widely used as a model organism and a cell factory, was selected for this study to demonstrate the superior capability of very long sequence reads for de novo genome assembly. We generated long reads using two common third-generation sequencing technologies (Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio)) and used short reads obtained using Illumina sequencing for error correction. Assembly of the reads derived from all three technologies resulted in complete sequences for all 16 yeast chromosomes, as well as the mitochondrial chromosome, in one step. Further, we identified three types of DNA methylation (5mC, 4mC and 6mA). Comparison between the reference strain S288C and strain CEN.PK113-7D identified chromosomal rearrangements against a background of similar gene content between the two strains. We identified full-length transcripts through ONT direct RNA sequencing technology. This allows for the identification of transcriptional landscapes, including untranslated regions (UTRs) (5' UTR and 3' UTR), as well as differential gene expression quantification. About 91% of the predicted transcripts could be consistently detected across biological replicates grown either on glucose or ethanol. Direct RNA sequencing identified many polyadenylated non-coding RNAs, rRNAs, telomere-RNA, long non-coding RNA and antisense RNA. This work demonstrates a strategy to obtain complete genome sequences and transcriptional landscapes that can be applied to other eukaryal organisms.


Subjects
Fungal Genome/genetics , High-Throughput Nucleotide Sequencing/methods , Fungal RNA/genetics , Saccharomyces cerevisiae/genetics , 3' Untranslated Regions/genetics , 5' Untranslated Regions/genetics , DNA Methylation/genetics , Genomics , Nanopores , Long Noncoding RNA/genetics , Repetitive Nucleic Acid Sequences/genetics , DNA Sequence Analysis
16.
Proc Natl Acad Sci U S A ; 114(11): E2215-E2224, 2017 03 14.
Article in English | MEDLINE | ID: mdl-28251929

ABSTRACT

Robust prognostic gene signatures and therapeutic targets are difficult to derive from expression profiling because of the significant heterogeneity within breast cancer (BC) subtypes. Here, we performed forward genetic screening in mice using Sleeping Beauty transposon mutagenesis to identify candidate BC driver genes in an unbiased manner, using a stabilized N-terminal truncated β-catenin gene as a sensitizer. We identified 134 mouse susceptibility genes from 129 common insertion sites within 34 mammary tumors. Of these, 126 genes were orthologous to protein-coding genes in the human genome (hereafter, human BC susceptibility genes, hBCSGs), 70% of which are previously reported cancer-associated genes, and ∼16% are known BC suppressor genes. Network analysis revealed a gene hub consisting of E1A binding protein P300 (EP300), CD44 molecule (CD44), neurofibromin (NF1) and phosphatase and tensin homolog (PTEN), which are linked to a significant number of mutated hBCSGs. From our survival prediction analysis of the expression of human BC genes in 2,333 BC cases, we isolated a six-gene-pair classifier that stratifies BC patients with high confidence into prognostically distinct low-, moderate-, and high-risk subgroups. Furthermore, we proposed prognostic classifiers identifying three basal and three claudin-low tumor subgroups. Intriguingly, our hBCSGs are mostly unrelated to cell cycle/mitosis genes and are distinct from the prognostic signatures currently used for stratifying BC patients. Our findings illustrate the strength and validity of integrating functional mutagenesis screens in mice with human cancer transcriptomic data to identify highly prognostic BC subtyping biomarkers.


Subjects
Breast Neoplasms/genetics , Neoplastic Cell Transformation/genetics , DNA Transposable Elements , Genetic Association Studies , Genetic Predisposition to Disease , Insertional Mutagenesis , Animals , Breast Neoplasms/metabolism , Breast Neoplasms/mortality , Breast Neoplasms/pathology , Tumor Cell Line , Neoplastic Cell Transformation/metabolism , Computational Biology/methods , Animal Disease Models , Female , Gene Expression Profiling , Neoplastic Gene Expression Regulation , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing , Humans , Mice , Knockout Mice , Mutation , Prognosis , Reproducibility of Results , Risk , Signal Transduction , Survival Analysis , Transcriptome
17.
Nucleic Acids Res ; 45(D1): D119-D127, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899586

ABSTRACT

R-loopDB (http://rloop.bii.a-star.edu.sg) was originally constructed as a collection of computationally predicted R-loop forming sequences (RLFSs) in human genic regions. The renewed R-loopDB provides updates, improvements and new options, including access to recent experimental data. It includes genome-scale prediction of RLFSs for humans, six other animals and yeast. Using the extended quantitative model of RLFSs (QmRLFS), we significantly increased the number of RLFSs predicted in human genes and identified RLFSs in the genomes of other organisms. R-loopDB allows searching of RLFSs in the genes and in the 2 kb upstream and downstream flanking sequences of any gene. R-loopDB exploits the Ensembl gene annotation system, providing users with chromosome coordinates, sequences, gene and genomic data of the 1 565 795 RLFSs distributed in 121 056 genic or proximal gene regions of the covered organisms. It provides a comprehensive annotation of Ensembl RLFS-positive genes, including 93 454 protein-coding genes, 12 480 long non-coding RNA genes, 7 568 small non-coding RNA genes and 7 554 pseudogenes. Using the new interface and genome viewers of R-loopDB, users can search for genes in multiple species with keywords in a single query. R-loopDB provides tools to carry out comparative evolution and genome-scale analyses in R-loop biology.


Subjects
DNA/chemistry , Databases, Nucleic Acid , Genes , RNA/chemistry , Animals , Genomics , Humans , Mice , Nucleic Acid Conformation
18.
Emerg Infect Dis ; 24(9)2018 09.
Article in English | MEDLINE | ID: mdl-29985788

ABSTRACT

We sequenced the virus genomes from 3 pregnant women in Thailand with Zika virus diagnoses. All had infections with the Asian lineage. The woman infected at gestational week 9, and not those infected at weeks 20 and 24, had a fetus with microcephaly. Asian lineage Zika viruses can cause microcephaly.


Subjects
Microcephaly/diagnosis , Pregnancy Complications, Infectious , Zika Virus Infection , Zika Virus/isolation & purification , Female , Humans , Infant, Newborn , Microcephaly/etiology , Pregnancy , Pregnancy Trimester, First , Thailand , Zika Virus/genetics
19.
BMC Cancer ; 18(1): 555, 2018 May 11.
Article in English | MEDLINE | ID: mdl-29751792

ABSTRACT

BACKGROUND: Single nucleotide polymorphisms (SNPs) can influence patient outcomes such as drug response and toxicity after drug intervention. The purpose of this study is to develop a systematic pathway approach to accurately and efficiently predict novel non-synonymous SNPs (nsSNPs) that could be causative for gemcitabine-based chemotherapy treatment outcome in Singaporean non-small cell lung cancer (NSCLC) patients. METHODS: Using a pathway approach that incorporates comprehensive protein-protein interaction data to systematically extend the gemcitabine pharmacologic pathway, we identified 77 related nsSNPs that are common in the Singaporean population. We then used five computational criteria to prioritize the SNPs based on their importance for protein function. We specifically selected and screened six candidate SNPs in a patient cohort with NSCLC treated with gemcitabine-based chemotherapy. RESULTS: We performed survival analysis followed by hematologic toxicity analyses and found that three of the six candidate SNPs are significantly correlated with patient outcome (P < 0.05): ABCG2 Q141K (rs2231142), SLC29A3 S158F (rs780668) and POLR2A N764K (rs2228130). CONCLUSIONS: Our computational SNP candidate enrichment workflow was able to identify several high-confidence biomarkers predictive of personalized drug treatment outcome while providing a rationale for a molecular mechanism of the SNP effect. TRIAL REGISTRATION: NCT00695994. Registered 10 June 2008 (retrospectively registered).


Subjects
Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Biomarkers, Tumor/genetics , Carcinoma, Non-Small-Cell Lung/drug therapy , Deoxycytidine/analogs & derivatives , Lung Neoplasms/drug therapy , Adult , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/mortality , Cohort Studies , Deoxycytidine/therapeutic use , Female , Genotype , Genotyping Techniques , Humans , Lung Neoplasms/genetics , Lung Neoplasms/mortality , Male , Polymorphism, Single Nucleotide , Precision Medicine/methods , Singapore/epidemiology , Survival Analysis , Treatment Outcome , Young Adult , Gemcitabine