ABSTRACT
INTRODUCTION: The field of forensic DNA analysis has undergone rapid advancements in recent decades. The integration of massively parallel sequencing (MPS) has notably expanded the forensic toolkit, moving beyond identity matching to predicting phenotypic traits and biogeographical ancestry. This shift is of particular significance in cases where conventional DNA profiling fails to identify a single suspect. Supplementing forensic analyses with estimated biological age may be valuable but involves a complex and time-consuming DNA methylation analysis. This study explores and validates the performance of a comprehensive forensic third-generation sequencing assay utilizing Oxford Nanopore Technologies (ONT) in an adaptive and direct sequencing approach. We incorporated the most widely used forensic markers, i.e., STRs, SNPs, InDels, mitochondrial DNA (mtDNA), and two methylation-based clock classifiers, thereby combining forensic genetic and epigenetic analysis in one single workflow. METHODS AND RESULTS: In our investigation, DNA from six anonymous individuals was sequenced using the ONT standard adaptive direct sequencing approach, reaching a mean percentage of on-target reads ranging from 6.6% to 7.7% per sample. ONT data was compared to standard MPS data and Illumina EPIC DNA methylation profiles. Basecalling employed recommended ONT software packages. TREAT was used for ONT-based analysis of autosomal and Y-chromosome STRs, achieving 90-92% correct calls depending on allelic read depth thresholds. InDel analyses for two lower-quality samples proved challenging due to inadequate read depth, while the remaining four samples significantly contributed to the observed percentage markers (60.9%) and correct calls (97.8%). SNP analysis achieved a 98% call rate, with only two mismatches and two missed alleles. ONT-generated DNA methylation data demonstrated Pearson's correlation coefficients with EPIC data ranging from 0.67 to 0.97 for Horvath's clock. Additional age-associated markers exhibited Pearson's correlation coefficients with chronological age between 0.14 (ELOVL2) and 0.96 (FHL2) at read depths of <30 and <20, respectively. Despite excluding mtDNA from our targeted sequencing approach, adaptive proof-reading fragments covered the complete mtDNA with an average read depth of 21-72, showing 100% concordance with reference data. DISCUSSION: Our exploratory study using ONT adaptive sequencing for conventional forensic and age-associated DNA methylation markers showed high sequencing accuracy for a significant number of markers, showcasing ONT as a promising (epi)genetic forensic method. Future studies must address three critical aspects: determining clear quantity and quality measures and detection thresholds for accuracy, optimizing input DNA quantity for forensic casework expectations, and addressing ethical considerations associated with phenotype and ancestry analysis to prevent ethnic biases.
ABSTRACT
Tandem repeats (TRs) play important roles in genomic variation and disease risk in humans. Long-read sequencing allows for the accurate characterization of TRs; however, the underlying bioinformatics perspectives remain challenging. We present otter and TREAT: otter is a fast targeted local assembler, cross-compatible across different sequencing platforms. It is integrated in TREAT, an end-to-end workflow for TR characterization, visualization and analysis across multiple genomes. In a comparison with existing tools based on long-read sequencing data from both Oxford Nanopore Technologies (ONT, Simplex and Duplex) and PacBio (Sequel 2 and Revio), otter and TREAT achieved state-of-the-art genotyping and motif characterization accuracy. Applied to clinically relevant TRs, TREAT/otter significantly identified individuals with pathogenic TR expansions. When applied to a case-control setting, we significantly replicated previously reported associations of TRs with Alzheimer's Disease, including those near or within the APOC1 (p = 2.63 × 10⁻⁹), SPI1 (p = 6.5 × 10⁻³) and ABCA7 (p = 0.04) genes. We used TREAT/otter to systematically evaluate potential biases when genotyping TRs using diverse ONT and PacBio long-read sequencing datasets. We showed that, in rare cases (0.06%), long-read sequencing suffers from coverage drops in TRs, including the disease-associated TRs in the ABCA7 and RFC1 genes. Such coverage drops can lead to TR misgenotyping, hampering the accurate characterization of TR alleles. Taken together, our tools can accurately genotype TRs across different sequencing technologies and with minimal requirements, allowing end-to-end analysis and comparisons of TRs in human genomes, with broad applications in research and clinical fields.
ABSTRACT
BACKGROUND AND OBJECTIVES: More than 200 genetic variants have been associated with multiple sclerosis (MS) susceptibility. However, it is unclear to what extent genetic factors influence lifetime risk of MS. Using a population-based birth-year cohort, we investigate the effect of genetics on lifetime risk of MS. METHODS: In the Project Y study, we tracked down almost all persons with MS (pwMS) from birth year 1966 in the Netherlands. As control participants, we included non-MS participants from the Project Y cohort (born 1965-1967 in the Netherlands) and non-MS participants from the Amsterdam Dementia Cohort born between 1963 and 1969. Genetic variants associated with MS were determined in pwMS and control participants using genotyping or imputation methods. Polygenic risk scores (PRSs) based on variants and weights from the largest genetic study in MS were calculated for each participant and assigned into deciles based on the PRS distribution in the control participants. We examined the lifetime risk for each decile and the association between PRS and MS disease variables, including age at onset and time to secondary progression. RESULTS: MS-PRS was calculated for 285 pwMS (mean age 53.0 ± 0.9 years, 72.3% female) and 267 control participants (mean age 51.8 ± 3.2 years, 58.1% female). Based on the lifetime risk estimation, we observed that 1:2,739 of the women with the lowest 30% genetic risk developed MS, whereas 1:92 of the women with the top 10% highest risk developed MS. For men, only 1:7,900 developed MS in the lowest 30% genetic risk group, compared with 1:293 men with the top 10% genetic risk. The PRS was not significantly associated with age at onset and time to secondary progression in both sexes. DISCUSSION: Our results show that the lifetime risk of MS is strongly influenced by genetic factors. 
Our findings have the potential to support diagnostic certainty in individuals with suspected MS: a high PRS could strengthen a diagnosis, but especially a PRS from the lowest tail of the PRS distribution should be considered a red flag and could prevent misdiagnosing conditions that mimic MS.
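The scoring and binning steps described in the METHODS can be sketched as follows. This is a minimal illustration only: the variant names, weights and control scores are hypothetical toy values, missing genotypes are assumed to contribute zero, and the simple index-based quantile rule is an assumption, not necessarily the one used in the study.

```python
from bisect import bisect_right


def polygenic_risk_score(dosages, weights):
    """Sum of effect-size weight times risk-allele dosage over all scored variants."""
    # Variants missing from `dosages` contribute 0 (a simplifying assumption).
    return sum(w * dosages.get(variant, 0.0) for variant, w in weights.items())


def decile_cutoffs(control_scores):
    """Nine cut-points that split the control PRS distribution into ten bins."""
    ordered = sorted(control_scores)
    n = len(ordered)
    return [ordered[(n * k) // 10] for k in range(1, 10)]


def assign_decile(score, cutoffs):
    """Return 1 (lowest genetic risk) to 10 (highest) for one participant."""
    return bisect_right(cutoffs, score) + 1


# Hypothetical toy data: two variants and 100 evenly spaced control scores.
weights = {"variant_a": 0.5, "variant_b": -0.2}
controls = [i / 10 for i in range(100)]  # 0.0, 0.1, ..., 9.9
cutoffs = decile_cutoffs(controls)
case_score = polygenic_risk_score({"variant_a": 2, "variant_b": 1}, weights)
print(assign_decile(case_score, cutoffs))
```

Because the cut-points come from the control participants only, a case is placed relative to the non-MS distribution, which is what allows risk to be compared between the lowest and highest genetic-risk groups.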
Subject(s)
Genetic Predisposition to Disease , Multifactorial Inheritance , Multiple Sclerosis , Humans , Multiple Sclerosis/genetics , Multiple Sclerosis/epidemiology , Male , Female , Middle Aged , Multifactorial Inheritance/genetics , Genetic Predisposition to Disease/genetics , Netherlands/epidemiology , Birth Cohort , Age of Onset , Cohort Studies , Risk Factors , Disease Progression , Genetic Risk Score
ABSTRACT
Antibiotic treatments have detrimental effects on the microbiome and lead to antibiotic resistance. To develop a phage therapy against a diverse range of clinically relevant Escherichia coli, we screened a library of 162 wild-type (WT) phages, identifying eight phages with broad coverage of E. coli, complementary binding to bacterial surface receptors, and the capability to stably carry inserted cargo. Selected phages were engineered with tail fibers and CRISPR-Cas machinery to specifically target E. coli. We show that engineered phages target bacteria in biofilms, reduce the emergence of phage-tolerant E. coli and out-compete their ancestral WT phages in coculture experiments. A combination of the four most complementary bacteriophages, called SNIPR001, is well tolerated in both mouse models and minipigs and reduces E. coli load in the mouse gut better than its constituent components separately. SNIPR001 is in clinical development to selectively kill E. coli, which may cause fatal infections in hematological cancer patients.
Subject(s)
Bacteriophages , Escherichia coli , Animals , Humans , Mice , Swine , Escherichia coli/genetics , Bacteriophages/genetics , CRISPR-Cas Systems/genetics , Swine, Miniature , Anti-Bacterial Agents
ABSTRACT
This study reports high-quality genomes of 11 sequence type 111 (ST111) isolates of Pseudomonas aeruginosa. This ST is known for its worldwide dissemination and high capacity to acquire antibiotic resistance mechanisms. This study used long- and short-read sequencing to provide high-quality closed genomes for most of the isolates.
ABSTRACT
Journal impact factor (IF) inflation is suggested as a problem resulting from commentaries published by the editors in chief (EiCs) of their respective journals. However, it is unclear whether this is a systemic problem across the top thirty cardiovascular medicine journals. Therefore, the purpose of this investigation was to examine the relationship between the number of commentaries written by an EiC and their journal's IF and Eigenfactor (Ef). Utilizing Spearman rank partial correlations controlling for length of service as the EiC, significant moderate correlations were found between the number of commentaries and the number of first-author commentaries by the EiC and the IF of their journal (r=0.568, p=0.001 and r=0.504, p=0.005; respectively). A weak but still significant correlation was found between the number of commentaries by the EiC and the Ef of their journal (r=0.431, p=0.020). The reason for these correlations is unclear, and whether the methodology used to compute the IF and Ef should be modified needs further research.
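A Spearman rank partial correlation of the kind reported above can be sketched in plain Python. The abstract does not state which software or exact procedure was used, so this is only one standard formulation, assumed here: rank each variable, take Pearson correlations of the ranks, and apply the first-order partial correlation formula with the control variable `z` (for example, length of service). All function and variable names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def partial_spearman(x, y, z):
    """Rank correlation of x and y after controlling for z."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

Called as `partial_spearman(commentaries, impact_factors, years_as_eic)` with three equal-length lists (hypothetical names), it returns the rank correlation between the first two after removing the part shared with the third.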
Subject(s)
Journal Impact Factor , Periodicals as Topic
ABSTRACT
The last decade has witnessed a remarkable increase in our ability to measure genetic information. Advancements in sequencing technologies are challenging the existing methods of data storage and analysis. While methods to cope with the data deluge are progressing, many biologists have lagged behind due to the fast pace of computational advancements and tools available to address their scientific questions. Future generations of biologists must be more computationally aware and capable, which means they should be trained in the computational skills needed to keep pace with technological developments. Here, we propose a model that bridges experimental and bioinformatics concepts using the Oxford Nanopore Technologies (ONT) sequencing platform. We provide both a guide to begin to empower the new generation of educators, scientists, and students in performing long-read assembly of bacterial and bacteriophage genomes and a standalone virtual machine containing all the required software and learning materials for the course.
Subject(s)
Computational Biology/education , Nanopore Sequencing , Humans , Software
ABSTRACT
BACKGROUND: The lager brewing yeast, S. pastorianus, is a hybrid between S. cerevisiae and S. eubayanus with extensive chromosome aneuploidy. S. pastorianus is subdivided into Group 1 and Group 2 strains, where Group 2 strains have higher copy number and a larger degree of heterozygosity for S. cerevisiae chromosomes. As a result, Group 2 strains were hypothesized to have emerged from a hybridization event distinct from Group 1 strains. Current genome assemblies of S. pastorianus strains are incomplete and highly fragmented, limiting our ability to investigate their evolutionary history. RESULTS: To fill this gap, we generated a chromosome-level genome assembly of the S. pastorianus strain CBS 1483 from Oxford Nanopore MinION DNA sequencing data and analysed the newly assembled subtelomeric regions and chromosome heterozygosity. To analyse the evolutionary history of S. pastorianus strains, we developed Alpaca: a method to compute sequence similarity between genomes without assuming linear evolution. Alpaca revealed high similarities between the S. cerevisiae subgenomes of Group 1 and 2 strains, and marked differences from sequenced S. cerevisiae strains. CONCLUSIONS: Our findings suggest that Group 1 and Group 2 strains originated from a single hybridization involving a heterozygous S. cerevisiae strain, followed by different evolutionary trajectories. The clear differences between both groups may originate from a severe population bottleneck caused by the isolation of the first pure cultures. Alpaca provides a computationally inexpensive method to analyse evolutionary relationships while considering non-linear evolution such as horizontal gene transfer and sexual reproduction, providing a complementary viewpoint beyond traditional phylogenetic approaches.
Subject(s)
Genome, Fungal , Saccharomyces cerevisiae/genetics , Saccharomyces/genetics , Beer , Chromosomes, Fungal , Haploidy , High-Throughput Nucleotide Sequencing , Hybridization, Genetic , Nanopore Sequencing
ABSTRACT
Saccharomyces pastorianus lager-brewing yeasts are domesticated hybrids of S. cerevisiae × S. eubayanus that display extensive inter-strain chromosome copy number variation and chromosomal recombinations. It is unclear to what extent such genome rearrangements are intrinsic to the domestication of hybrid brewing yeasts and whether they contribute to their industrial performance. Here, an allodiploid laboratory hybrid of S. cerevisiae and S. eubayanus was evolved for up to 418 generations on wort under simulated lager-brewing conditions in six independent sequential batch bioreactors. Characterization of 55 single-cell isolates from the evolved cultures showed large phenotypic diversity and whole-genome sequencing revealed a large array of mutations. Frequent loss of heterozygosity involved diverse, strain-specific chromosomal translocations, which differed from those observed in domesticated, aneuploid S. pastorianus brewing strains. In contrast to the extensive aneuploidy of domesticated S. pastorianus strains, the evolved isolates only showed limited (segmental) aneuploidy. Specific mutations could be linked to calcium-dependent flocculation, loss of maltotriose utilization and loss of mitochondrial activity, three industrially relevant traits that also occur in domesticated S. pastorianus strains. This study indicates that fast acquisition of extensive aneuploidy is not required for genetic adaptation of S. cerevisiae × S. eubayanus hybrids to brewing environments. In addition, this work demonstrates that, consistent with the diversity of brewing strains for maltotriose utilization, domestication under brewing conditions can result in loss of this industrially relevant trait. These observations have important implications for the design of strategies to improve industrial performance of novel laboratory-made hybrids.
ABSTRACT
Motivation: A long-standing limitation in comparative genomic studies is the dependency on a reference genome, which limits the spectrum of genetic diversity that can be identified across a population of organisms. This is especially true in the microbial world, where genome architectures can vary significantly. There is therefore a need for computational methods that can simultaneously analyze the architectures of multiple genomes without introducing bias from a reference. Results: In this article, we present Ptolemy: a novel method for studying the diversity of genome architectures (such as structural variation and pan-genomes) across a collection of microbial assemblies without the need for a reference. Ptolemy is a 'top-down' approach to compare whole genome assemblies. Genomes are represented as labeled multi-directed graphs, known as quivers, which are then merged into a single, canonical quiver by identifying 'gene anchors' via synteny analysis. The canonical quiver represents an approximate, structural alignment of all genomes in a given collection, encoding structural variation across (sub-) populations within the collection. We highlight various applications of Ptolemy by analyzing structural variation and the pan-genomes of different datasets composed of Mycobacterium, Saccharomyces, Escherichia and Shigella species. Our results show that Ptolemy is flexible and can handle both conserved and highly dynamic genome architectures. Ptolemy is user-friendly (it requires only a FASTA-formatted assembly along with a corresponding GFF-formatted file) and resource-friendly (it can align 24 genomes in ~10 min with four CPUs and <2 GB of RAM). Availability and implementation: GitHub: https://github.com/AbeelLab/ptolemy. Supplementary information: Supplementary data are available at Bioinformatics online.
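The idea of merging per-genome gene graphs into one canonical graph can be illustrated with a toy sketch. This is not Ptolemy's implementation: the synteny-based anchor assignment, which is the hard part, is assumed to be given as a lookup table, and every name below (`build_canonical_quiver`, `branch_points`, the gene and anchor labels) is hypothetical.

```python
from collections import defaultdict


def build_canonical_quiver(genomes, anchor_of):
    """Merge gene-order paths into one graph with genome-labeled edges.

    genomes:   {genome name: [gene ids in chromosomal order]}
    anchor_of: {gene id: shared anchor label} (assumed precomputed)
    Returns {(source anchor, target anchor): set of genome names}.
    """
    edges = defaultdict(set)
    for name, genes in genomes.items():
        anchors = [anchor_of[g] for g in genes]
        # Each pair of adjacent genes becomes one directed, labeled edge.
        for src, dst in zip(anchors, anchors[1:]):
            edges[(src, dst)].add(name)
    return dict(edges)


def branch_points(quiver):
    """Anchors followed by different anchors in different genomes."""
    successors = defaultdict(set)
    for src, dst in quiver:
        successors[src].add(dst)
    return {src for src, dsts in successors.items() if len(dsts) > 1}


# Toy example: genome B lacks the gene that genome A carries at anchor G2.
genomes = {"A": ["a1", "a2", "a3", "a4"], "B": ["b1", "b3", "b4"]}
anchor_of = {"a1": "G1", "a2": "G2", "a3": "G3", "a4": "G4",
             "b1": "G1", "b3": "G3", "b4": "G4"}
quiver = build_canonical_quiver(genomes, anchor_of)
print(branch_points(quiver))
```

In this toy case the only branch point is `G1`, where genome A continues to `G2` and genome B skips to `G3`; edges shared by both genomes, such as `G3` to `G4`, carry both labels.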
Subject(s)
Genome, Microbial , Synteny , Software
ABSTRACT
The haploid Saccharomyces cerevisiae strain CEN.PK113-7D is a popular model system for metabolic engineering and systems biology research. Current genome assemblies are based on short-read sequencing data scaffolded based on homology to strain S288C. However, these assemblies contain large sequence gaps, particularly in subtelomeric regions, and the assumption of perfect homology to S288C for scaffolding introduces bias. In this study, we obtained a near-complete genome assembly of CEN.PK113-7D using only Oxford Nanopore Technologies' MinION sequencing platform. Fifteen of the 16 chromosomes, the mitochondrial genome and the 2-µm plasmid are assembled in single contigs and all but one chromosome starts or ends in a telomere repeat. This improved genome assembly contains 770 Kbp of added sequence containing 248 gene annotations in comparison to the previous assembly of CEN.PK113-7D. Many of these genes encode functions determining fitness in specific growth conditions and are therefore highly relevant for various industrial applications. Furthermore, we discovered a translocation between chromosomes III and VIII that caused misidentification of a MAL locus in the previous CEN.PK113-7D assembly. This study demonstrates the power of long-read sequencing by providing a high-quality reference assembly and annotation of CEN.PK113-7D and places a caveat on assumed genome stability of microorganisms.
Subject(s)
Genome, Fungal , Genomics , High-Throughput Nucleotide Sequencing , Nanopores , Saccharomyces cerevisiae/genetics , Sequence Analysis, DNA , Chromosomes, Fungal , Computational Biology/methods , Genetic Heterogeneity , Genomics/methods , Translocation, Genetic
ABSTRACT
BACKGROUND: India is home to 25% of all tuberculosis cases and the second-highest number of multidrug-resistant cases worldwide. However, little is known about the genetic diversity and resistance determinants of Indian Mycobacterium tuberculosis, particularly for the primary lineages found in India, lineages 1 and 3. METHODS: We whole genome sequenced 223 randomly selected M. tuberculosis strains from 196 patients within the Tiruvallur and Madurai districts of Tamil Nadu in Southern India. Using comparative genomics, we examined genetic diversity, transmission patterns, and evolution of resistance. RESULTS: Genomic analyses revealed (1) prevalence of strains from lineages 1 and 3, (2) recent transmission of strains among patients from the same treatment centers, (3) emergence of drug resistance within patients over time, (4) resistance gained in an order typical of strains from different lineages and geographies, (5) underperformance of known resistance-conferring mutations to explain phenotypic resistance in Indian strains relative to studies focused on other geographies, and (6) the possibility that resistance arose through mutations not previously implicated in resistance, or through infections with multiple strains that confound genotype-based prediction of resistance. CONCLUSIONS: In addition to substantially expanding the genomic perspectives of lineages 1 and 3, sequencing and analysis of M. tuberculosis whole genomes from Southern India highlight challenges of infection control and rapid diagnosis of resistant tuberculosis using current technologies. Further studies are needed to fully explore the complement of diversity and resistance determinants within endemic M. tuberculosis populations.
Subject(s)
Drug Resistance, Multiple, Bacterial/genetics , Genome, Bacterial , Mycobacterium tuberculosis/genetics , Tuberculosis/diagnosis , Tuberculosis/microbiology , Adult , Antitubercular Agents/pharmacology , Base Sequence , Female , Genetic Variation , Humans , India/epidemiology , Male , Mutation , Mycobacterium tuberculosis/classification , Mycobacterium tuberculosis/drug effects , Phylogeny , Polymerase Chain Reaction , Tuberculosis/epidemiology , Tuberculosis/transmission
ABSTRACT
Multidrug-resistant tuberculosis (MDR-TB), caused by drug-resistant strains of Mycobacterium tuberculosis, is an increasingly serious problem worldwide. Here we examined a data set of whole-genome sequences from 5,310 M. tuberculosis isolates from five continents. Despite the great diversity of these isolates with respect to geographical point of isolation, genetic background and drug resistance, the patterns for the emergence of drug resistance were conserved globally. We have identified harbinger mutations that often precede multidrug resistance. In particular, the katG mutation encoding p.Ser315Thr, which confers resistance to isoniazid, overwhelmingly arose before mutations that conferred rifampicin resistance across all of the lineages, geographical regions and time periods. Therefore, molecular diagnostics that include markers for rifampicin resistance alone will be insufficient to identify pre-MDR strains. Incorporating knowledge of polymorphisms that occur before the emergence of multidrug resistance, particularly katG p.Ser315Thr, into molecular diagnostics should enable targeted treatment of patients with pre-MDR-TB to prevent further development of MDR-TB.
Subject(s)
Drug Resistance, Multiple, Bacterial/genetics , Mycobacterium tuberculosis/genetics , Tuberculosis, Multidrug-Resistant/genetics , Antitubercular Agents/therapeutic use , Bacterial Proteins/genetics , Catalase/genetics , Genomics/methods , Humans , Isoniazid/therapeutic use , Mutation/genetics , Mycobacterium tuberculosis/drug effects , Polymorphism, Genetic/genetics , Rifampin/therapeutic use , Tuberculosis, Multidrug-Resistant/drug therapy
ABSTRACT
A more complete understanding of the genetic basis of drug resistance in Mycobacterium tuberculosis is critical for prompt diagnosis and optimal treatment, particularly for toxic second-line drugs such as D-cycloserine. Here we used the whole-genome sequences from 498 strains of M. tuberculosis to identify new resistance-conferring genotypes. By combining association and correlated evolution tests with strategies for amplifying signal from rare variants, we found that loss-of-function mutations in ald (Rv2780), encoding L-alanine dehydrogenase, were associated with unexplained drug resistance. Convergent evolution of this loss of function was observed exclusively among multidrug-resistant strains. Drug susceptibility testing established that ald loss of function conferred resistance to D-cycloserine, and susceptibility to the drug was partially restored by complementation of ald. Clinical strains with mutations in ald and alr exhibited increased resistance to D-cycloserine when cultured in vitro. Incorporation of D-cycloserine resistance in novel molecular diagnostics could allow for targeted use of this toxic drug among patients with susceptible infections.