Abstract
Structural variants (SVs) underlie important crop improvement and domestication traits. However, resolving the extent, diversity, and quantitative impact of SVs has been challenging. We used long-read nanopore sequencing to capture 238,490 SVs in 100 diverse tomato lines. This panSV genome, along with 14 new reference assemblies, revealed large-scale intermixing of diverse genotypes, as well as thousands of SVs intersecting genes and cis-regulatory regions. Hundreds of SV-gene pairs exhibit subtle and significant expression changes, which could broadly influence quantitative trait variation. By combining quantitative genetics with genome editing, we show how multiple SVs that changed gene dosage and expression levels modified fruit flavor, size, and production. In the last example, higher order epistasis among four SVs affecting three related transcription factors allowed introduction of an important harvesting trait in modern tomato. Our findings highlight the underexplored role of SVs in genotype-to-phenotype relationships and their widespread importance and utility in crop improvement.
Subjects
Agricultural Crops/genetics , Plant Gene Expression Regulation , Genomic Structural Variation , Solanum lycopersicum/genetics , Alleles , Cytochrome P-450 Enzyme System/genetics , Ecotype , Genetic Epistasis , Fruit/genetics , Gene Duplication , Plant Genome , Genotype , Inbreeding , Molecular Sequence Annotation , Phenotype , Plant Breeding , Quantitative Trait Loci/genetics
Abstract
The highly diverse Solanaceae family contains several widely studied models and crop species. Fully exploring, appreciating, and exploiting this diversity requires additional model systems. Particularly promising are orphan fruit crops in the genus Physalis, which occupy a key evolutionary position in the Solanaceae and capture understudied variation in traits such as inflorescence complexity, fruit ripening and metabolites, disease and insect resistance, self-compatibility, and most notably, the striking inflated calyx syndrome (ICS), an evolutionary novelty found across angiosperms where sepals grow exceptionally large to encapsulate fruits in a protective husk. We recently developed transformation and genome editing in Physalis grisea (groundcherry). However, to systematically explore and unlock the potential of this and related Physalis as genetic systems, high-quality genome assemblies are needed. Here, we present chromosome-scale references for P. grisea and its close relative Physalis pruinosa and use these resources to study natural and engineered variations in floral traits. We first rapidly identified a natural structural variant in a bHLH gene that causes petal color variation. Further, and against expectations, we found that CRISPR-Cas9-targeted mutagenesis of 11 MADS-box genes, including purported essential regulators of ICS, had no effect on inflation. In a forward genetics screen, we identified huskless, which lacks ICS due to mutation of an AP2-like gene that causes sepals and petals to merge into a single whorl of mixed identity. These resources and findings elevate Physalis to a new Solanaceae model system and establish a paradigm in the search for factors driving ICS.
Subjects
Physalis , Solanaceae , Solanaceae/genetics , Physalis/genetics , Physalis/metabolism , Biological Evolution , Mutation , Gene Editing
Abstract
Klebsiella pneumoniae (Kp) is an important cause of healthcare-associated infections, which increase patient morbidity, mortality, and hospitalization costs. Gut colonization by Kp is consistently associated with subsequent Kp disease, and patients are predominantly infected with their colonizing strain. Our previous comparative genomics study, between disease-causing and asymptomatically colonizing Kp isolates, identified a plasmid-encoded tellurite (TeO₃²⁻)-resistance (ter) operon as strongly associated with infection. However, TeO₃²⁻ is extremely rare and toxic to humans. Thus, we used a multidisciplinary approach to determine the biological link between ter and Kp infection. First, we used a genomic and bioinformatic approach to extensively characterize Kp plasmids encoding the ter locus. These plasmids displayed substantial variation in plasmid incompatibility type and gene content. Moreover, the ter operon was genetically independent of other plasmid-encoded virulence and antibiotic resistance loci, both in our original patient cohort and in a large set (n = 88) of publicly available ter operon-encoding Kp plasmids, indicating that the ter operon is likely playing a direct but as yet undescribed role in Kp disease. Next, we employed multiple mouse models of infection and colonization to show that 1) the ter operon is dispensable during bacteremia, 2) the ter operon enhances fitness in the gut, 3) this phenotype is dependent on the colony of origin of mice, and 4) antibiotic disruption of the gut microbiota eliminates the requirement for ter. Furthermore, using 16S rRNA gene sequencing, we show that the ter operon enhances Kp fitness in the gut in the presence of specific indigenous microbiota, including those predicted to produce short-chain fatty acids. Finally, administration of exogenous short-chain fatty acids in our mouse model of colonization was sufficient to reduce fitness of a ter mutant. These findings indicate that the ter operon, strongly associated with human infection, encodes factors that resist stress induced by the indigenous gut microbiota during colonization. This work represents a substantial advancement in our molecular understanding of Kp pathogenesis and gut colonization, directly relevant to Kp disease in healthcare settings.
Subjects
Gastrointestinal Microbiome/genetics , Intestines/microbiology , Klebsiella/genetics , Plasmids/genetics , Animals , Bacteremia/genetics , Bacterial Proteins/genetics , Female , Genetic Fitness/physiology , Genetic Loci/physiology , Bacterial Genome , Host-Pathogen Interactions/genetics , Kanamycin Resistance/genetics , Klebsiella Infections/microbiology , Male , Mice , Inbred C57BL Mice , Operon/genetics , Organ Specificity/genetics , Virulence/genetics , beta-Lactamases/genetics
Abstract
BACKGROUND: Repeated coronavirus disease 2019 (COVID-19) molecular testing can lead to positive test results after negative results and to multiple positive results over time. The association between positive test results and infectious virus is important to quantify. METHODS: A 2-month cohort of retrospective data and consecutively collected specimens from patients with COVID-19 or patients under investigation were used to understand the correlation between prolonged viral RNA positive test results, cycle threshold (Ct) values and growth of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in cell culture. Whole-genome sequencing was used to confirm virus genotype in patients with prolonged viral RNA detection. Droplet digital polymerase chain reaction was used to assess the rate of false-negative COVID-19 diagnostic test results. RESULTS: In 2 months, 29 686 specimens were tested and 2194 patients underwent repeated testing. Virus recovery in cell culture was noted in specimens with a mean Ct value of 18.8 (3.4) for SARS-CoV-2 target genes. Prolonged viral RNA shedding was associated with positive virus growth in culture in specimens collected up to 21 days after the first positive result but mostly in individuals symptomatic at the time of sample collection. Whole-genome sequencing provided evidence the same virus was carried over time. Positive test results following negative results had Ct values >29.5 and were not associated with virus culture. Droplet digital polymerase chain reaction results were positive in 5.6% of negative specimens collected from patients with confirmed or clinically suspected COVID-19. CONCLUSIONS: Low Ct values in SARS-CoV-2 diagnostic tests were associated with virus growth in cell culture. Symptomatic patients with prolonged viral RNA shedding can also be infectious.
Subjects
COVID-19 , SARS-CoV-2 , Humans , Viral RNA/genetics , Retrospective Studies , Virus Shedding
Abstract
Triboluminescence (TL) is shown to enable selective detection of trace crystallinity within nominally amorphous solid dispersions (ASDs). ASDs are increasingly used for the preparation of pharmaceutical formulations, the physical stability of which can be negatively impacted by trace crystallinity introduced during manufacturing or storage. In the present study, TL measurements of a model ASD consisting of griseofulvin in polyethylene glycol produced limits of detection of 140 ppm. Separate studies of the particle size dependence of sucrose crystals and the dependence on polymorphism in clopidogrel bisulfate particles are both consistent with a mechanism for TL closely linked to the piezoelectric response of the crystalline fraction. Whereas disordered polymeric materials cannot support piezoelectric activity, molecular crystals produced from homochiral molecules adopt crystal structures that are overwhelmingly symmetry-allowed for piezoelectricity. Consequently, TL may provide a broadly applicable and simple experimental route for sensitive detection of trace crystallinity within nominally amorphous materials.
Subjects
Drug Compounding , Luminescent Measurements , Pharmaceutical Preparations/analysis , Luminescent Measurements/instrumentation
Abstract
Second harmonic generation (SHG) was integrated with Raman spectroscopy for the analysis of pharmaceutical materials. Particulate formulations of clopidogrel bisulfate were prepared in two crystal forms (Form I and Form II). Image analysis approaches enable automated identification of particles by bright field imaging, followed by classification by SHG. Quantitative SHG microscopy enabled discrimination of crystal form on a per particle basis with 99.95% confidence in a total measurement time of ~10 ms per particle. Complementary measurements by Raman and synchrotron XRD are in excellent agreement with the classifications made by SHG, with measurement times of ~1 min and several seconds per particle, respectively. Coupling these capabilities with at-line monitoring may enable real-time feedback for reaction monitoring during pharmaceutical production to favor the more bioavailable but metastable Form I with limits of detection in the ppm regime.
Abstract
An enduring question in evolutionary biology concerns the degree to which episodes of convergent trait evolution depend on the same genetic programs, particularly over long timescales. In this work, we genetically dissected repeated origins and losses of prickles (sharp epidermal projections) that convergently evolved in numerous plant lineages. Mutations in a cytokinin hormone biosynthetic gene caused at least 16 independent losses of prickles in eggplants and wild relatives in the genus Solanum. Homologs underlie prickle formation across angiosperms that collectively diverged more than 150 million years ago, including rice and roses. By developing new Solanum genetic systems, we leveraged this discovery to eliminate prickles in a wild species and an indigenously foraged berry. Our findings implicate a shared hormone activation genetic program underlying evolutionarily widespread and recurrent instances of plant morphological innovation.
Subjects
Biological Evolution , Cytokinins , Plant Genes , Plant Epidermis , Solanum , Cytokinins/biosynthesis , Cytokinins/genetics , Molecular Evolution , Mutation , Oryza/genetics , Phylogeny , Plant Epidermis/anatomy & histology , Plant Epidermis/genetics , Solanum/anatomy & histology , Solanum/genetics
Abstract
Objectives: Carbapenem-resistant Enterobacterales (CRE) are an urgent public health threat. A better understanding of the molecular epidemiology and transmission dynamics of CRE is necessary to limit their dissemination within healthcare settings. We sought to investigate the mechanisms of resistance and spread of CRE within multiple hospitals in Maryland. Methods: From 2016 to 2018, all CRE were collected from any specimen source from The Johns Hopkins Medical Institutions. The isolates were further characterized using both phenotypic and genotypic approaches, including short- and/or long-read WGS. Results: From 2016 to 2018, 302 of 40 908 (0.7%) unique Enterobacterales isolates were identified as CRE. Of CRE, 142 (47%) were carbapenemase-producing CRE with KPC (80.3%) predominating among various genera. Significant genetic diversity was identified among all CRE with high-risk clones serving as major drivers of clonal clusters. Further, we found the predominance of pUVA-like plasmids, with a subset harbouring resistance genes to environmental cleaning agents, involved in intergenus dissemination of blaKPC genes. Conclusions: Our findings provide valuable data to understand the transmission dynamics of all CRE within the greater Maryland region. These data can help guide targeted interventions to limit CRE transmission in healthcare facilities.
Abstract
We present the genome of the living fossil, Wollemia nobilis, a southern hemisphere conifer morphologically unchanged since the Cretaceous. Presumed extinct until rediscovery in 1994, the Wollemi pine is critically endangered with fewer than 60 wild adults threatened by intensifying bushfires in the Blue Mountains of Australia. The 12 Gb genome is among the most contiguous large plant genomes assembled, with extremely low heterozygosity and unusual abundance of DNA transposons. Reduced representation and genome re-sequencing of individuals confirm a relictual population since the last major glacial/drying period in Australia, 120 ky BP. Small RNA and methylome sequencing reveal conservation of ancient silencing mechanisms despite the presence of thousands of active and abundant transposons, including some transferred horizontally to conifers from arthropods in the Jurassic. A retrotransposon burst 8-6 my BP coincided with population decline, possibly as an adaptation enhancing epigenetic diversity. Wollemia, like other conifers, is susceptible to Phytophthora, and a suite of defense genes, similar to those in loblolly pine, are targeted for silencing by sRNAs in leaves. The genome provides insight into the earliest seed plants, while enabling conservation efforts.
Abstract
In this paper, continuous crystallization of Atorvastatin calcium (ASC) using a continuous oscillatory baffled crystallizer (COBC) has been investigated. Like most APIs, ASC is manufactured batchwise, and the pure API is recovered via a batch combined cooling and antisolvent crystallization (CCAC) process, which has the challenges of low productivity, wide crystal size distribution (CSD) and sometimes polymorphic form contamination. To overcome the limitations of batch crystallization, continuous crystallization of ASC was studied in a NiTech (United Kingdom) DN15 COBC, manufactured by Alconbury Weston Ltd. (AWL, United Kingdom), with the aim of improving productivity and the CSD of the desired polymorph. The COBC has the advantage of high heat transfer rates and improved mixing, which significantly reduce the crystallization time. It also has the advantage of spatial temperature distribution and multiple addition ports to control supersaturation and hence the crystallization process. This work uses an array of process analytical technology (PAT) tools to assess key process parameters that affect the polymorphic outcome and CSD. Two parameters were found to have a significant impact on the polymorph: the ratio of solvent to antisolvent at the point of mixing of the two streams, and the presence of seeds. Splitting the antisolvent between two addition ports in the COBC was found to give the desired form. The CCAC of ASC in the COBC was found to be ~30-fold more productive than the batch CCAC process. The cycle time for generating 100 g of the desired polymorphic form of ASC was also significantly reduced, from 22 h in the batch process to 12 min in the COBC. The crystals obtained using a CCAC process in a COBC had a narrower CSD than those from a batch crystallization process.
Subjects
Crystallization , Atorvastatin , Phase Transition , Solvents/chemistry , United Kingdom , Particle Size
Abstract
There are many short-read variant-calling tools, with different strengths and weaknesses. We present a tool, Minos, which combines outputs from arbitrary variant callers, increasing recall without loss of precision. We benchmark on 62 samples from three bacterial species and an outbreak of 385 Mycobacterium tuberculosis samples. Minos also enables joint genotyping; we demonstrate on a large (N=13k) M. tuberculosis cohort, building a map of non-synonymous SNPs and indels in a region where all such variants are assumed to cause rifampicin resistance. We quantify the correlation with phenotypic resistance and then replicate in a second cohort (N=10k).
Subjects
High-Throughput Nucleotide Sequencing , Mycobacterium tuberculosis , Bacterial Genome , Genotype , Humans , INDEL Mutation , Mycobacterium tuberculosis/genetics , Single Nucleotide Polymorphism
Abstract
The early COVID-19 pandemic was characterized by rapid global spread. In Maryland and Washington, DC, United States, more than 2500 cases were reported within 3 weeks of the first COVID-19 detection in March 2020. We aimed to use genomic sequencing to understand the initial spread of SARS-CoV-2 - the virus that causes COVID-19 - in the region. We analyzed 620 samples collected from the Johns Hopkins Health System during March 11-31, 2020, comprising 28.6% of the total cases in Maryland and Washington, DC. From these samples, we generated 114 complete viral genomes. Analysis of these genomes alongside a subsampling of over 1000 previously published sequences showed that the diversity in this region rivaled global SARS-CoV-2 genetic diversity at that time and that the sequences belong to all of the major globally circulating lineages, suggesting multiple introductions into the region. We also analyzed these regional SARS-CoV-2 genomes alongside detailed clinical metadata and found that clinically severe cases had viral genomes belonging to all major viral lineages. We conclude that efforts to control local spread of the virus were likely confounded by the number of introductions into the region early in the epidemic and the interconnectedness of the region as a whole.
Subjects
COVID-19/virology , Viral Genome , Pandemics , Phylogeny , SARS-CoV-2/genetics , Adolescent , Adult , Aged , Aged 80 and over , Baltimore , Base Sequence , COVID-19/epidemiology , COVID-19/transmission , Child , Disease Outbreaks , Infectious Disease Transmission , District of Columbia , Female , Genomics/methods , Global Health , Humans , Male , Middle Aged , Young Adult
Abstract
BACKGROUND: The early COVID-19 pandemic has been characterized by rapid global spread. In the United States National Capital Region, over 2,000 cases were reported within three weeks of its first detection in March 2020. We aimed to use genomic sequencing to understand the initial spread of SARS-CoV-2, the virus that causes COVID-19, in the region. By correlating genetic information to disease phenotype, we also aimed to gain insight into any correlation between viral genotype and case severity or transmissibility. METHODS: We performed whole genome sequencing of clinical SARS-CoV-2 samples collected in March 2020 by the Johns Hopkins Health System. We analyzed these regional SARS-CoV-2 genomes alongside detailed clinical metadata and the global phylogeny to understand early establishment of the virus within the region. RESULTS: We analyzed 620 samples from the Johns Hopkins Health System collected between March 11 and 31, 2020, comprising 37.3% of the total cases in Maryland during this period. We selected 143 of these samples for sequencing, generating 114 complete viral genomes. These genomes belong to all five major Nextstrain-defined clades, suggesting multiple introductions into the region and underscoring the diversity of the regional epidemic. We also found that clinically severe cases had genomes belonging to all of these clades. CONCLUSIONS: We established a pipeline for SARS-CoV-2 sequencing within the Johns Hopkins Health System, which enabled us to capture the significant viral diversity present in the region as early as March 2020. Efforts to control local spread of the virus were likely confounded by the number of introductions into the region early in the epidemic and the interconnectedness of the region as a whole.
Abstract
We present RaGOO, a reference-guided contig ordering and orienting tool that leverages the speed and sensitivity of Minimap2 to accurately achieve chromosome-scale assemblies in minutes. After the pseudomolecules are constructed, RaGOO identifies structural variants, including those spanning sequencing gaps. We show that RaGOO accurately orders and orients 3 de novo tomato genome assemblies, including the widely used M82 reference cultivar. We then demonstrate the scalability and utility of RaGOO with a pan-genome analysis of 103 Arabidopsis thaliana accessions by examining the structural variants detected in the newly assembled pseudomolecules. RaGOO is available open source at https://github.com/malonge/RaGOO .
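As a rough illustration of what reference-guided ordering and orienting involves, the sketch below places each contig on the reference chromosome it covers most, orients it by the majority strand, and sorts contigs by their average reference position. It is a simplified stand-in and not RaGOO's actual algorithm; the function name `order_and_orient` and the alignment tuple layout are hypothetical, loosely modelled on fields that a whole-genome aligner such as Minimap2 reports.

```python
from collections import defaultdict


def order_and_orient(alignments):
    """Toy reference-guided scaffolding (illustrative assumption, not RaGOO itself).

    alignments: iterable of (contig, ref_chrom, ref_start, ref_end, strand)
    returns: {ref_chrom: [(contig, orientation), ...]} sorted along the reference
    """
    covered = defaultdict(lambda: defaultdict(int))      # contig -> chrom -> aligned bases
    strand_bp = defaultdict(lambda: defaultdict(int))    # (contig, chrom) -> strand -> bases
    weighted_mid = defaultdict(float)                    # (contig, chrom) -> sum(len * midpoint)

    for contig, chrom, start, end, strand in alignments:
        length = end - start
        covered[contig][chrom] += length
        strand_bp[(contig, chrom)][strand] += length
        weighted_mid[(contig, chrom)] += length * (start + end) / 2

    layout = defaultdict(list)
    for contig, per_chrom in covered.items():
        # Assign the contig to the chromosome it covers with the most bases.
        chrom = max(per_chrom, key=per_chrom.get)
        strands = strand_bp[(contig, chrom)]
        # Orient by majority strand, measured in aligned bases.
        orientation = "+" if strands["+"] >= strands["-"] else "-"
        # Order by the length-weighted mean reference midpoint.
        position = weighted_mid[(contig, chrom)] / per_chrom[chrom]
        layout[chrom].append((position, contig, orientation))

    return {c: [(name, o) for _, name, o in sorted(items)] for c, items in layout.items()}


example = [
    ("ctgB", "chr1", 5000, 9000, "-"),
    ("ctgA", "chr1", 0, 4000, "+"),
    ("ctgA", "chr2", 100, 300, "+"),
    ("ctgC", "chr2", 0, 1000, "+"),
]
scaffolds = order_and_orient(example)
```

A real tool would also filter low-quality alignments and insert gap padding between adjacent contigs; those steps are left out here.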
Subjects
Genomics/methods , Software , Arabidopsis/genetics , Plant Genome , Genomic Structural Variation , Solanum lycopersicum/genetics
Abstract
Domestication of clonally propagated crops such as pineapple from South America was hypothesized to be a 'one-step operation'. We sequenced the genome of Ananas comosus var. bracteatus CB5 and assembled 513 Mb into 25 chromosomes with 29,412 genes. Comparison of the genomes of CB5, F153 and MD2 elucidated the genomic basis of fiber production, color formation, sugar accumulation and fruit maturation. We also resequenced 89 Ananas genomes. Cultivars 'Smooth Cayenne' and 'Queen' exhibited ancient and recent admixture, while 'Singapore Spanish' supported a one-step operation of domestication. We identified 25 selective sweeps, including a strong sweep containing a pair of tandemly duplicated bromelain inhibitors. Four candidate genes for self-incompatibility were linked in F153, but were not functional in self-compatible CB5. Our findings support the coexistence of sexual recombination and a one-step operation in the domestication of clonally propagated crops. This work guides the exploration of sexual and asexual domestication trajectories in other clonally propagated crops.
Subjects
Ananas/genetics , Agricultural Crops/genetics , Domestication , Plant Genome , Plant Proteins/genetics , Genetically Modified Plants/genetics , Heritable Quantitative Trait , Ananas/growth & development , Bromelains/metabolism , Agricultural Crops/growth & development , Plant Gene Expression Regulation , Phenotype , Genetically Modified Plants/growth & development , Population Dynamics , Sugars/metabolism
Abstract
Linked-Read sequencing technology has recently been employed successfully for de novo assembly of human genomes; however, the utility of this technology for complex plant genomes is unproven. We evaluated the technology for this purpose by sequencing the 3.5-gigabase (Gb) diploid pepper (Capsicum annuum) genome with a single Linked-Read library. Plant genomes, including pepper, are characterized by long, highly similar repetitive sequences. Accordingly, significant effort is used to ensure that the sequenced plant is highly homozygous and the resulting assembly is a haploid consensus. With a phased assembly approach, we targeted a heterozygous F1 derived from a wide cross to assess the ability to derive both haplotypes and characterize a pungency gene with a large insertion/deletion. The Supernova software generated a highly ordered, more contiguous sequence assembly than all currently available C. annuum reference genomes. Over 83% of the final assembly was anchored and oriented using four publicly available de novo linkage maps. A comparison of the annotation of conserved eukaryotic genes indicated the completeness of assembly. The validity of the phased assembly is further demonstrated with the complete recovery of both 2.5-Kb insertion/deletion haplotypes of the PUN1 locus in the F1 sample that represents pungent and nonpungent peppers, as well as nearly full recovery of the BUSCO2 gene set within each of the two haplotypes. The most contiguous pepper genome assembly to date has been generated, which demonstrates that Linked-Read library technology provides a tool to de novo assemble complex, highly repetitive, heterozygous plant genomes. This technology can provide an opportunity to cost-effectively develop high-quality genome assemblies for other complex plants and compare structural and gene differences through accurate haplotype reconstruction.
Abstract
BACKGROUND: Determining interacting SNPs in genome-wide association studies is computationally expensive yet of considerable interest in genomics. FINDINGS: We present a program, Chi8, that calculates the Chi-square 8 degree of freedom test between all pairs of SNPs in a brute force manner on a Graphics Processing Unit. We analyze each of the seven WTCCC genome-wide association studies, which have about 5000 total cases and controls and 400,000 SNPs, in an average of 9.6 h on a single GPU. We also study the power, false positives, and area under curve of our program on simulated data and provide a comparison to the GBOOST program. Our program source code is freely available from http://www.cs.njit.edu/usman/Chi8.
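The 8 degrees of freedom arise because two SNPs with three genotypes each give nine joint genotype classes, and a 2 x 9 case/control table has (2 - 1)(9 - 1) = 8 degrees of freedom. The CPU-only sketch below shows that test for one SNP pair and a brute-force scan over all pairs. It is an illustration under stated assumptions, not the Chi8 GPU implementation: the function names and the 0/1/2 genotype coding are hypothetical.

```python
import math
from itertools import combinations


def chi8_statistic(geno_a, geno_b, is_case):
    """Pearson chi-square statistic on the 2 x 9 table for one SNP pair.

    geno_a, geno_b: genotypes coded 0, 1 or 2 per individual (assumed coding)
    is_case: True for cases, False for controls
    """
    # Rows: control (0) / case (1). Columns: joint genotype index 3*a + b.
    table = [[0] * 9 for _ in range(2)]
    for a, b, case in zip(geno_a, geno_b, is_case):
        table[int(case)][3 * a + b] += 1

    n = sum(map(sum, table))
    row_tot = [sum(row) for row in table]
    col_tot = [table[0][j] + table[1][j] for j in range(9)]

    stat = 0.0
    for i in range(2):
        for j in range(9):
            expected = row_tot[i] * col_tot[j] / n
            # Unobserved genotype classes are skipped; strictly this lowers
            # the effective degrees of freedom below 8.
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    return stat


def chi2_sf_8df(x):
    """Upper-tail p-value of a chi-square with 8 df (closed form for even df)."""
    h = x / 2.0
    return math.exp(-h) * (1 + h + h * h / 2 + h ** 3 / 6)


def scan_pairs(genotypes, is_case):
    """Brute-force statistic for every SNP pair, as a dict keyed by (i, j)."""
    return {
        (i, j): chi8_statistic(genotypes[i], genotypes[j], is_case)
        for i, j in combinations(range(len(genotypes)), 2)
    }
```

With m SNPs the scan evaluates m(m - 1)/2 independent tables, which is why the workload maps naturally onto one GPU thread per pair.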