ABSTRACT
PREMISE: Whole-genome duplications (WGDs) are prevalent throughout the evolutionary history of plants. For example, dozens of WGDs have been phylogenetically localized across the order Brassicales, specifically within the family Brassicaceae. A WGD event has also been identified in the Cleomaceae, the sister family to Brassicaceae, yet its placement, as well as that of WGDs in other families in the order, remains unclear. METHODS: Phylo-transcriptomic data were generated and used to infer a nuclear phylogeny for 74 Brassicales taxa. Genome survey sequencing was also performed on 66 of those taxa to infer a chloroplast phylogeny. These phylogenies were used to assess and confirm relationships among the major families of the Brassicales and within Brassicaceae. Multiple WGD inference methods were then used to assess the placement of WGDs on the nuclear phylogeny. RESULTS: Well-supported chloroplast and nuclear phylogenies for the Brassicales and the putative placement of the Cleomaceae-specific WGD event Th-α are presented. This work also provides evidence for previously hypothesized WGDs, including a well-supported event shared by at least two members of the Resedaceae family, and a possible event within the Capparaceae. CONCLUSIONS: Phylogenetics and the placement of WGDs within highly polyploid lineages continue to be a major challenge. This study adds to the conversation on WGD inference difficulties by demonstrating that sampling is especially important for WGD identification and phylogenetic placement. Given its economic importance and genomic resources, the Brassicales continues to be an ideal group for assessing WGD inference methods.
Subjects
Gene Duplication; Magnoliopsida/genetics; Evolution, Molecular; Genome; Genome, Plant/genetics; Humans; Phylogeny; Polyploidy
ABSTRACT
Intermittent hypoxia (IH) is a hallmark of obstructive sleep apnea (OSA) and induces metabolic dysfunction manifesting as inflammation, increased lipolysis and insulin resistance in visceral white adipose tissues (vWAT). However, the cell types and their corresponding transcriptional pathways underlying these functional perturbations are unknown. Here, we applied single nucleus RNA sequencing (snRNA-seq) coupled with aggregate RNA-seq methods to evaluate the cellular heterogeneity in vWAT following IH exposures mimicking OSA. C57BL/6 male mice were exposed to IH and room air (RA) for 6 weeks, and nuclei from vWAT were isolated and processed for snRNA-seq followed by differentially expressed gene (DEG) analyses by cell type, along with gene ontology and canonical pathway enrichment tests of significance. IH induced significant transcriptional changes compared to RA across 14 different cell types identified in vWAT. We identified cell-specific signature markers, transcriptional networks, metabolic signaling pathways, and cellular subpopulation enrichment in vWAT. Globally, we also identified 298 commonly regulated genes across multiple cell types that are associated with metabolic pathways. Deconvolution of cell types in vWAT using global RNA-seq revealed that distinct adipocytes appear to be differentially implicated in key aspects of metabolic dysfunction. Thus, the heterogeneity of vWAT and its response to IH at the cellular level provides important insights into the metabolic morbidity of OSA and may translate into therapeutic targets.
Subjects
Adipocytes/metabolism; Gene Expression Profiling; Hypoxia/metabolism; Intra-Abdominal Fat/metabolism; Transcriptome; Animals; Computational Biology/methods; Gene Expression Regulation; Gene Ontology; High-Throughput Nucleotide Sequencing; Mice; Molecular Sequence Annotation; RNA, Small Untranslated; Single-Cell Analysis
Subjects
COVID-19; SARS-CoV-2; Data Science; Genomics; Humans; Missouri; Power, Psychological
ABSTRACT
Systematic evolution of ligands through exponential enrichment (SELEX) is a well-established method for generating nucleic acid populations that are enriched for specified functions. High-throughput sequencing (HTS) enhances the power of comparative sequence analysis to reveal details of how RNAs within these populations recognize their targets. We used HTS analysis to evaluate RNA populations selected to bind type I human immunodeficiency virus reverse transcriptase (RT). The populations are enriched in RNAs of independent lineages that converge on shared motifs and in clusters of RNAs with nearly identical sequences that share common ancestry. Both of these features informed inferences of the secondary structures of enriched RNAs, their minimal structural requirements and their stabilities in RT-aptamer complexes. Monitoring population dynamics in response to increasing selection pressure revealed RNA inhibitors of RT that are more potent than the previously identified pseudoknots. Improved potency was observed for inhibition of both purified RT in enzymatic assays and viral replication in cell-based assays. Structural and functional details of converged motifs that are obscured by simple consensus descriptions are also revealed by the HTS analysis. The approach presented here can readily be generalized for the efficient and systematic post-SELEX development of aptamers for down-stream applications.
Subjects
Anti-HIV Agents/chemistry; Aptamers, Nucleotide/chemistry; HIV Reverse Transcriptase/antagonists & inhibitors; High-Throughput Nucleotide Sequencing/methods; Reverse Transcriptase Inhibitors/chemistry; Sequence Analysis, RNA/methods; Anti-HIV Agents/pharmacology; Aptamers, Nucleotide/pharmacology; Base Sequence; Consensus Sequence; HIV-1/drug effects; HIV-1/physiology; Nucleotide Motifs; Reverse Transcriptase Inhibitors/pharmacology; SELEX Aptamer Technique; Virus Replication/drug effects
ABSTRACT
Polymeric chemically amplified resists (CARs) are critical materials for high-throughput lithographic processes. A photoactivated acid-anion catalyst changes the polymer's solubility via a deprotection reaction, which enables pattern development through selective dissolution. To capture observed reaction kinetics, reaction-diffusion models employ a catalyst diffusivity that is accelerated by reaction. However, the microscopic origin of this phenomenon and the factors contributing to it remain unclear. Herein, we employ detailed atomistic molecular dynamics simulations to examine the impact of protecting group removal and material relaxation on catalyst mobility. We report data on polymer density, catalyst dispersion, excess free volume, and segmental dynamics with increasing time/extent of deprotection. We then propose simple kinetic Monte Carlo algorithms that can describe both molecular dynamics simulations of deprotection reactions and experimental data.
Subjects
Molecular Dynamics Simulation; Polymers; Diffusion; Kinetics; Monte Carlo Method; Polymers/chemistry
ABSTRACT
BACKGROUND: An introgression library is a family of near-isogenic lines in a common genetic background, each of which carries one or more genomic regions contributed by a donor genome. Near-isogenic lines are powerful genetic resources for the analysis of phenotypic variation and are important for map-based cloning of genes underlying mutations and traits. With many thousands of distinct genotypes, querying introgression libraries for lines of interest is a challenge. RESULTS: We have created IView, a tool to graphically display and query near-isogenic line libraries for specific introgressions. This tool incorporates a web interface for displaying the location and extent of introgressions. Each genetic marker is associated with a position on a reference map. Users can search for introgressions using marker names, or chromosome number and map positions. This search results in a display of lines carrying an introgression at the specified position. Upon selecting one of the lines, color-coded introgressions on all chromosomes of the line are displayed graphically. The source code for IView can be downloaded from http://xrl.us/iview. CONCLUSIONS: IView will be useful for those wanting to make introgression data from their stock of germplasm searchable.
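The position-based search described above can be illustrated with a minimal sketch. It is not IView's actual code: the `Introgression` record, the `lines_at` function, the line names, and the centimorgan coordinates are all hypothetical, chosen only to show how a chromosome-and-position query reduces to an interval test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Introgression:
    """One donor segment carried by a near-isogenic line (hypothetical record)."""
    line: str          # name of the near-isogenic line
    chromosome: int    # chromosome number on the reference map
    start_cm: float    # map position where the donor segment begins
    end_cm: float      # map position where the donor segment ends


def lines_at(introgressions, chromosome, position_cm):
    """Return the sorted names of lines carrying donor DNA at the given position."""
    return sorted({
        seg.line
        for seg in introgressions
        if seg.chromosome == chromosome
        and seg.start_cm <= position_cm <= seg.end_cm
    })


# Made-up library for illustration only.
library = [
    Introgression("NIL-001", 1, 10.0, 35.5),
    Introgression("NIL-002", 1, 30.0, 60.0),
    Introgression("NIL-002", 4, 5.0, 12.0),
    Introgression("NIL-003", 2, 0.0, 20.0),
]

print(lines_at(library, chromosome=1, position_cm=32.0))
```

A search by marker name would first look up the marker's chromosome and map position, then run the same interval test.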
Subjects
Genomic Library; Genomics/methods; Software; Genetic Markers/genetics; Genotype; Phenotype; Quantitative Trait Loci; User-Computer Interface
ABSTRACT
BACKGROUND: Biological chemistry is very stereospecific. Nonetheless, the diastereotopic oxygen atoms of diphosphate-containing molecules in the Protein Data Bank (PDB) are often given names that do not uniquely distinguish them from each other due to the lack of standardization. This issue has largely not been addressed by the protein structure community. RESULTS: Of 472 diastereotopic atom pairs studied from the PDB, 118 were found to have names that are not uniquely assigned. Among the molecules identified with these inconsistencies were many cofactors of enzymatic processes such as mononucleotides (e.g. ADP, ATP, GTP), dinucleotide cofactors (e.g. FAD, NAD), and coenzyme A. There were no overall trends in naming conventions, though ligand-specific trends were prominent. CONCLUSION: The lack of standardized naming conventions for diastereotopic atoms of small molecules has left the ad hoc names assigned to many of these atoms non-unique, which may create problems in data-mining of the PDB. We suggest a naming convention to resolve this issue. The in-house software used in this study is available upon request. A version of the software used for the analyses described in this paper is available at our web site: http://digbio.missouri.edu/ddan/DDAN.htm.
Subjects
Algorithms; Databases, Protein; Diphosphates/chemistry; Diphosphates/classification; Information Storage and Retrieval/methods; Proteins/chemistry; Proteins/classification; Software; Terminology as Topic; Database Management Systems; Stereoisomerism
ABSTRACT
Protein-bound water molecules are important components of protein structure, and therefore, protein function and energetics. Although structural conservation of solvent has been studied in a few protein families, a lack of suitable computational tools has hindered more comprehensive analyses. Herein we present a semiautomated computational approach for identifying solvent sites that are conserved among proteins sharing a common three-dimensional structure. This method is tested on six protein families: (1) monodomain cytochrome c, (2) fatty-acid binding protein, (3) lactate/malate dehydrogenase, (4) parvalbumin, (5) phospholipase A2, and (6) serine protease. For each family, the method successfully identified previously known conserved solvent sites. Moreover, the method discovered 22 novel conserved solvent sites, some of which have higher degrees of conservation than the previously known sites. All six families studied had solvent sites with more than 90% conservation and these sites were invariably located in regions of the protein with very high sequence conservation. These results suggest that highly conserved solvent sites, by virtue of their proximity to conserved residues, should be considered as one of the defining three-dimensional structural characteristics of protein families and folds.
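The idea of a "degree of conservation" for a solvent site can be made concrete with a small sketch. This is not the authors' method, which the abstract does not detail. It assumes the structures have already been superimposed, that a site counts as occupied when any water lies within a cutoff distance, and that 1.5 Å is a reasonable cutoff. The cutoff, the function name, and the coordinates are all illustrative assumptions.

```python
import math


def conservation(site, structures, cutoff=1.5):
    """Fraction of structures with at least one water within `cutoff` (Å) of `site`.

    `site` is an (x, y, z) tuple; `structures` is a list in which each element
    is the list of water-oxygen coordinates of one superimposed structure.
    The 1.5 Å default is an assumed value, not taken from the source.
    """
    occupied = sum(
        any(math.dist(site, water) <= cutoff for water in waters)
        for waters in structures
    )
    return occupied / len(structures)


# Made-up coordinates: the site is occupied in three of four structures.
candidate_site = (0.0, 0.0, 0.0)
family = [
    [(0.3, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (5.0, 5.0, 5.0)],
    [(3.0, 0.0, 0.0)],
    [(0.0, 0.0, -0.5)],
]

print(conservation(candidate_site, family))
```

Under these assumptions, a site reported as more than 90% conserved would be one for which this fraction exceeds 0.9 across the family.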
Subjects
Proteomics/methods; Animals; Automation; Binding Sites; Computer Simulation; Humans; Ligands; Multigene Family; Protein Conformation; Protein Folding; Software; Solvents/chemistry; Water/chemistry
ABSTRACT
Teosinte (Zea mays subsp. parviglumis H. H. Iltis & Doebley) has greater genetic diversity than maize inbreds and landraces (Z. mays subsp. mays). There are, however, limited genetic resources to efficiently evaluate and tap this diversity. To broaden resources for genetic diversity studies in maize, we developed and evaluated 928 near-isogenic introgression lines (NILs) from 10 teosinte accessions in the B73 background. Joint linkage analysis of the 10 introgression populations identified several large-effect quantitative trait loci (QTL) for days to anthesis (DTA), kernel row number (KRN), and 50-kernel weight (Wt50k). Our results confirm prior reports of kernel domestication loci and identify previously uncharacterized QTL with a range of allelic effects, enabling future research into the genetic basis of these traits. Additionally, we used a targeted set of NILs to validate the effects of a KRN QTL located on chromosome 2. These introgression populations offer novel tools for QTL discovery and validation as well as a platform for initiating fine mapping.
Subjects
Alleles; Zea mays/genetics; Chromosome Mapping; Genetic Linkage; Genetic Variation; Quantitative Trait Loci
ABSTRACT
A computational comparison of 102 high-resolution (≤1.90 Å) enzyme-dinucleotide (NAD, NADP, FAD) complexes was performed to investigate the role of solvent in dinucleotide recognition by Rossmann fold domains. The typical binding site contains about 9-12 water molecules, and about 30% of the hydrogen bonds between the protein and the dinucleotide are water mediated. Detailed inspection of the structures reveals a structurally conserved water molecule bridging dinucleotides with the well-known glycine-rich phosphate-binding loop. This water molecule displays a conserved hydrogen-bonding pattern. It forms hydrogen bonds to the dinucleotide pyrophosphate, two of the three conserved glycine residues of the phosphate-binding loop, and a residue at the C-terminus of strand four of the Rossmann fold. The conserved water molecule is also present in high-resolution structures of apo enzymes. However, the water molecule is not present in structures displaying significant deviations from the classic Rossmann fold motif, such as having nonstandard topology, containing a very short phosphate-binding loop, or having alpha-helix "A" oriented perpendicular to the beta-sheet. Thus, the conserved water molecule appears to be an inherent structural feature of the classic Rossmann dinucleotide-binding domain.
Subjects
Flavin-Adenine Dinucleotide/chemistry; NADP/chemistry; NAD/chemistry; Protein Conformation; Water/chemistry; Animals; Bacterial Proteins/chemistry; Binding Sites; Fungal Proteins/chemistry; Glycine/chemistry; Humans; Hydrogen Bonding; Models, Molecular; Molecular Structure; Plant Proteins/chemistry; Protein Binding; Protein Folding; Protozoan Proteins/chemistry; Solvents/chemistry
ABSTRACT
The crystal structure of rat alpha-parvalbumin has been determined at 1.05 Angstrom resolution, using synchrotron data collected at Advanced Photon Source beamline 19-ID. After refinement with SHELX, employing anisotropic displacement parameters and riding hydrogen atoms, R = 0.132 and R(free) = 0.162. The average coordinate estimated standard deviations are 0.021 Angstrom and 0.038 Angstrom for backbone atoms and side-chain atoms, respectively. Besides providing a more precise view of the alpha-isoform than previously available, these data permit comparison with the 0.91 Angstrom structure determined for pike beta-parvalbumin. Visualization of the anisotropic displacement parameters as thermal ellipsoids yields insight into the atomic motion within the Ca(2+)-binding sites. The asymmetric unit includes three parvalbumin (PV) molecules. Interestingly, the EF site in one displays uncharacteristic flexibility. The ellipsoids for Asp-92 are particularly large and non-spherical, and the shape of the Ca(2+) ellipsoid implies significant vibrational motion perpendicular to the plane defined by the four y and z ligands. The relative dearth of crystal-packing interactions in this site suggests that the heightened flexibility may be the result of diminished intermolecular contacts. The implication is that, by impeding conformational mobility, crystal-packing forces may cause serious overestimation of EF-hand rigidity. The high quality of the data permitted 11 residues to be modeled in alternative side-chain conformations, including the two core residues, Ile-97 and Leu-105. The discrete disorder observed for Ile-97 may have functional ramifications, providing a mechanism for communicating binding status between the CD and EF binding loops and between the PV metal ion-binding domain and the N-terminal AB region.
Subjects
Calcium/chemistry; Parvalbumins/chemistry; Software; Animals; Binding Sites; Calcium/metabolism; Computer Graphics; Computer Simulation; Crystallography, X-Ray; Parvalbumins/metabolism; Protein Binding; Protein Structure, Tertiary; Rats
ABSTRACT
Using High-Throughput DNA Sequencing (HTS) to examine gene expression is rapidly becoming a viable choice and is typically referred to as RNA-seq. Often the depth and breadth of coverage of RNA-seq data can exceed what is achievable using microarrays. However, the strengths of RNA-seq are often its greatest weaknesses. Accurately and comprehensively mapping millions of relatively short reads to a reference genome sequence can require not only specialized software, but also more structured and automated procedures to manage, analyze, and visualize the data. Additionally, the computational hardware required to efficiently process and store the data can be a necessary and often-overlooked component of a research plan. We discuss several aspects of the computational analysis of RNA-seq, including file management and data quality control, analysis, and visualization. We provide a framework for a standard nomenclature system that can facilitate automation and the ability to track data provenance. Finally, we provide a general workflow of the computational analysis of RNA-seq and a downloadable package of scripts to automate the processing.
Subjects
Contig Mapping/methods; Animals; Computer Graphics; Gene Expression Profiling/methods; Genomics; High-Throughput Nucleotide Sequencing/methods; High-Throughput Nucleotide Sequencing/standards; Humans; Information Storage and Retrieval/methods; Sequence Analysis, RNA/standards; Software; Terminology as Topic
ABSTRACT
Maize genetic diversity has been used to understand the molecular basis of phenotypic variation and to improve agricultural efficiency and sustainability. We crossed 25 diverse inbred maize lines to the B73 reference line, capturing a total of 136,000 recombination events. Variation for recombination frequencies was observed among families, influenced by local (cis) genetic variation. We identified evidence for numerous minor single-locus effects but little two-locus linkage disequilibrium or segregation distortion, which indicated a limited role for genes with large effects and epistatic interactions on fitness. We observed excess residual heterozygosity in pericentromeric regions, which suggested that selection in inbred lines has been less efficient in these regions because of reduced recombination frequency. This implies that pericentromeric regions may contribute disproportionally to heterosis.
Subjects
Chromosome Mapping; Chromosomes, Plant/genetics; Genetic Variation; Quantitative Trait, Heritable; Zea mays/genetics; Alleles; Centromere/genetics; Crosses, Genetic; Epistasis, Genetic; Flowers/genetics; Flowers/growth & development; Genome, Plant; Heterozygote; Hybrid Vigor; Inbreeding; Linkage Disequilibrium; Phenotype; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Recombination, Genetic; Selection, Genetic; Zea mays/classification; Zea mays/physiology
ABSTRACT
The multifunctional Escherichia coli proline utilization A (PutA) flavoprotein functions both as a membrane-associated proline catabolic enzyme and as a transcriptional repressor of the proline utilization genes putA and putP. To better understand the mechanism of transcriptional regulation by PutA, we have mapped the put-regulatory region, determined a crystal structure of the PutA ribbon-helix-helix domain (PutA52, a polypeptide corresponding to residues 1-52 of E. coli PutA) complexed with DNA, and examined the thermodynamics of DNA binding to PutA52. Five operator sites, each containing the sequence motif 5'-GTTGCA-3', were identified using gel-shift analysis. Three of the sites are shown to be critical for repression of putA, whereas the two other sites are important for repression of putP. The 2.25-Å-resolution crystal structure of PutA52 bound to one of the operators (operator 2; 21 bp) shows that the protein contacts a 9-bp fragment corresponding to the GTTGCA consensus motif plus three flanking base pairs. Since the operator sequences differ in flanking bases, the structure implies that PutA may have different affinities for the five operators. This hypothesis was explored using isothermal titration calorimetry. The binding of PutA52 to operator 2 is exothermic, with an enthalpy of -1.8 kcal/mol and a dissociation constant of 210 nM. Substitution of the flanking bases of operator 4 into operator 2 results in an unfavorable enthalpy of 0.2 kcal/mol and a 15-fold-lower affinity, showing that base pairs outside of the consensus motif impact binding. Structural and thermodynamic data suggest that hydrogen bonds between Lys9 and bases adjacent to the GTTGCA motif contribute to transcriptional regulation by fine-tuning the affinity of PutA for put control operators.
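The calorimetric values quoted above can be turned into free energies with the standard relations ΔG° = RT ln K_d and ΔG° = ΔH − TΔS. The sketch below is a back-of-the-envelope calculation, not part of the study. The abstract does not state the measurement temperature, so 298.15 K is an assumption, and a 1 M reference state is implied. The function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temp_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses dG = RT ln(Kd) with a 1 M reference state. The default temperature
    is an assumption; the source does not report it.
    """
    return R_KCAL * temp_k * math.log(kd_molar)


def entropic_term(delta_g, delta_h):
    """Return -T*dS (kcal/mol) from dG = dH - T*dS."""
    return delta_g - delta_h


# Operator 2: Kd = 210 nM and dH = -1.8 kcal/mol, as quoted in the abstract.
dg_op2 = binding_free_energy(210e-9)
minus_tds_op2 = entropic_term(dg_op2, -1.8)

# A 15-fold-lower affinity corresponds to a 15-fold-higher Kd.
dg_variant = binding_free_energy(15 * 210e-9)
ddg = dg_variant - dg_op2

print(f"dG (operator 2)   = {dg_op2:.2f} kcal/mol")
print(f"-TdS (operator 2) = {minus_tds_op2:.2f} kcal/mol")
print(f"ddG (15-fold)     = {ddg:.2f} kcal/mol")
```

Under the assumed temperature, most of the binding free energy for operator 2 comes from the entropic term rather than from the small exothermic enthalpy. The 15-fold change in affinity corresponds to roughly RT ln 15 in free energy.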
Subjects
Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Escherichia coli/enzymology; Escherichia coli/genetics; Membrane Proteins/genetics; Membrane Proteins/metabolism; Proline/metabolism; Regulon/genetics; Transcription, Genetic/genetics; Amino Acid Transport Systems, Neutral/genetics; Amino Acid Transport Systems, Neutral/metabolism; Bacterial Proteins/chemistry; Base Sequence; Binding Sites; Calorimetry; Crystallography, X-Ray; DNA/chemistry; DNA/genetics; DNA/metabolism; Escherichia coli Proteins/genetics; Escherichia coli Proteins/metabolism; Membrane Proteins/chemistry; Models, Molecular; Nucleic Acid Conformation; Oxygen/chemistry; Oxygen/metabolism; Proline/genetics; Protein Binding; Protein Structure, Tertiary; Symporters/genetics; Symporters/metabolism; Titrimetry; beta-Galactosidase/metabolism
ABSTRACT
Deposition of anti-DNA antibodies in the kidney contributes to the pathogenesis of the autoimmune disease systemic lupus erythematosus. Antibodies that bind to hairpin-forming DNA ligands may be particularly prone to deposition. Here we report the first structure of a Fab complexed with hairpin-forming DNA. The ligand used for co-crystallization is 5'-d[CTG(CCTT)CAG]-3', which has a predicted hairpin structure consisting of a four-nucleotide loop (CCTT) and a stem of three base-pairs. The 1.95 Å resolution crystal structure of Fab DNA-1 complexed with this ligand shows that the conformation of the bound ligand differs radically from the predicted hairpin conformation. The three base-pairs in the stem are absent in the bound form. The protein binds to the last six nucleotides at the 3' end of the ligand. These nucleotides form a loop (TTCA) closed by a G:C base-pair in the bound state. Stacking of aromatic side-chains against DNA bases is the dominant interaction in the complex. Interactions with the DNA backbone are conspicuously absent. Thermodynamics of binding are examined using isothermal titration calorimetry. The apparent dissociation constant is 4 μM, and binding is enthalpically favorable and entropically unfavorable. Increasing the number of base-pairs in the DNA stem from three to six decreases binding affinity. These data suggest a conformational selection binding mechanism in which the Fab binds preferentially to the unstructured state of the ligand. In this interpretation, the ligand binding and ligand folding equilibria are coupled, with lower hairpin stability leading to greater effective binding affinity. Thus, pre-organization of the DNA loop into the preferred binding conformation does not play a major role in complexation. Rather, it is argued that the stem of the hairpin serves to reduce the degrees of freedom in the free DNA ligand, thereby limiting the entropic cost attendant to complexation with the Fab.
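The coupling between ligand folding and binding described above can be written out for the simplest linked-equilibrium case. The relation below is a textbook two-state sketch, not a model taken from the study. It assumes the free ligand is either hairpin (H) or unstructured (U) and that the Fab binds only U. The symbols $K_{\mathrm{fold}}$, $K_d$, and $K_d^{\mathrm{app}}$ are introduced here for illustration.

```latex
\begin{align}
K_{\mathrm{fold}} &= \frac{[\mathrm{H}]}{[\mathrm{U}]}, \qquad
K_{d} = \frac{[\mathrm{Fab}]\,[\mathrm{U}]}{[\mathrm{Fab}\cdot\mathrm{U}]}, \\
K_{d}^{\mathrm{app}} &= \frac{[\mathrm{Fab}]\,\bigl([\mathrm{U}] + [\mathrm{H}]\bigr)}{[\mathrm{Fab}\cdot\mathrm{U}]}
= K_{d}\,\bigl(1 + K_{\mathrm{fold}}\bigr).
\end{align}
```

Under these assumptions, a more stable hairpin (larger $K_{\mathrm{fold}}$) raises the apparent dissociation constant. This matches the reported loss of affinity when the stem is lengthened from three to six base-pairs.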
Subjects
Antibodies, Antinuclear/immunology; DNA, Single-Stranded/chemistry; DNA, Single-Stranded/immunology; Binding Sites; Calorimetry; Crystallography, X-Ray; Humans; Immunoglobulin Fab Fragments; Nucleic Acid Conformation; Oligonucleotides; Thermodynamics
ABSTRACT
The Kelch repeat is a common sequence motif in eukaryotic genomes and is approximately 50 amino acids in length. The structure of the Kelch domain of the human Keap1 protein has previously been determined at 1.85 Angstrom, showing that each Kelch repeat forms one blade of a six-bladed beta-propeller. Here, use of 1.35 Angstrom SAD data for de novo structure determination of the Kelch domain and for refinement at atomic resolution is described. The high quality and resolution of the diffraction data and phase information allows a detailed analysis of the role of solvent in the structure of the Kelch repeat. Ten structurally conserved water molecules are identified in each blade of the Kelch beta-propeller. These appear to play distinct structural roles that include lining the central channel of the propeller, interacting with residues in loops between strands of the blade and making contacts with conserved residues in the Kelch repeat. Furthermore, we identify a conserved C-H...pi hydrogen bond between two key residues in the consensus Kelch repeat. This analysis extends our understanding of the structural roles of conserved residues in the Kelch repeat and highlights the potential role of solvent in maintaining the fold of this common eukaryotic structural motif.