ABSTRACT
The Y chromosome is theorized to facilitate the evolution of sexual dimorphism by accumulating sexually antagonistic loci, but empirical support is scarce. Because they lack recombination, Y chromosomes are prone to degenerative processes, which constrains their adaptive potential. Yet in the seed beetle Callosobruchus maculatus, segregating Y-linked variation affects male body size and thereby sexual size dimorphism (SSD). Here, we assemble C. maculatus sex chromosome sequences and identify molecular differences associated with Y-linked SSD variation. The assembled Y chromosome is largely euchromatic and contains over 400 genes, many of which are ampliconic with a mixed autosomal and X chromosome ancestry. Functional annotation suggests that the Y chromosome plays important roles in males beyond primary reproductive functions. Crucially, we find that, besides an autosomal copy of the gene target of rapamycin (TOR), males carry an additional TOR copy on the Y chromosome. TOR is a conserved regulator of growth across taxa, and our results suggest that a Y-linked TOR provides a male-specific opportunity to alter body size. A comparison of Y haplotypes associated with male size difference uncovers copy number variation in TOR: the haplotype associated with decreased male size, and thereby increased sexual dimorphism, has two additional TOR copies. This suggests that sexual conflict over growth has been mitigated by autosome-to-Y translocation of TOR followed by gene duplications. Our results reveal that, despite suppressed recombination, the Y chromosome can harbor adaptive potential as a male-limited supergene.
Subjects
Coleoptera, DNA Copy Number Variations, Male, Animals, Coleoptera/genetics, Sex Characteristics, Y Chromosome, Seeds
ABSTRACT
C-H borylation is a high-value transformation in the synthesis of lead candidates for the pharmaceutical industry because a wide array of downstream coupling reactions is available. However, predicting its regioselectivity, especially in drug-like molecules that may contain multiple heterocycles, is not a trivial task. Using a data set of borylation reactions from Reaxys, we explored how a language model originally trained on USPTO_500_MT, a broad-scope set of patent data, can be used to predict the C-H borylation reaction product in different modes: product generation and site reactivity classification. Our fine-tuned T5Chem multitask language model can generate the correct product in 79% of cases. It can also classify the reactive aromatic C-H bonds with 95% accuracy and 88% positive predictive value, exceeding purpose-developed graph-based neural networks.
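The two classification figures quoted above measure different things: accuracy counts all correct calls, while positive predictive value (PPV) only asks how often a predicted reactive site is truly reactive. The following is a minimal sketch of both definitions; the function name and the toy labels are hypothetical and are not taken from the study's data or code.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, positive predictive value) for binary labels.

    1 marks a reactive C-H site, 0 a non-reactive one (illustrative convention).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    correct = sum(1 for t, p in pairs if t == p)        # TP + TN
    accuracy = correct / len(pairs)
    # PPV is undefined when nothing is predicted positive.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return accuracy, ppv


# Toy labels, invented purely for illustration.
y_true = [1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 0, 1, 0]
accuracy, ppv = classification_metrics(y_true, y_pred)
```

Because most aromatic C-H sites in a molecule are typically non-reactive, accuracy can be high even when positive calls are unreliable, which is why reporting PPV alongside it is informative.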
Subjects
Hydrogen, Hydrogen/chemistry, Chemical Models, Computer Neural Networks
ABSTRACT
Databases of small, potentially bioactive molecules are ubiquitous across industry and academia. These databases are designed so that each unique compound appears only once, but because many compounds can be represented in multiple ways, they require methods for standardizing the representation of chemistry. This is commonly achieved through the use of "Chemistry Business Rules", sets of predefined rules that describe the "house style" of the database in question. At Syngenta, the historical approach to the design of chemistry business rules has been to focus on consistency of representation, with chemical relevance given secondary consideration. In this work, we overturn that convention. Through the use of quantum chemistry calculations, we define a set of chemistry business rules for tautomer standardization that reproduces gas-phase energetic preferences. We go on to show that, compared to our historical approach, this method yields tautomers that are in better agreement with those observed experimentally in condensed phases and that are better suited for use in predictive models.
Subjects
Isomerism, Factual Databases, Reference Standards
ABSTRACT
Advances in structural biology, such as cryo-electron microscopy (cryo-EM), have allowed a number of sophisticated protein complexes to be characterized. However, often only a static snapshot of a protein complex is visualized, even though conformational change is frequently inherent to biological function, as is the case for molecular motors. Computer simulations provide valuable insights into the different conformations available to a particular system that are not accessible using conventional structural techniques. For larger proteins and protein complexes, where a fully atomistic description would be computationally prohibitive, coarse-grained simulation techniques such as Elastic Network Modeling (ENM) are often employed, whereby each atom or group of atoms is linked by a set of springs whose properties can be customized according to the system of interest. Here we compare ENM with a recently proposed continuum model known as Fluctuating Finite Element Analysis (FFEA), which represents the biomolecule as a viscoelastic solid subject to thermal fluctuations. These two complementary computational techniques are used to answer a critical question in the rotary ATPase family: implicit within these motors is the need for a rotor axle and proton pump to rotate independently of the motor domain and stator structures. However, current single-particle cryo-EM reconstructions have shown an apparent connection between the stators and the rotor axle or pump region, which would hinder rotation. Both modeling approaches show a possible role for this connection and how it would significantly constrain the mobility of the rotary ATPase family.
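The spring-network idea behind ENM can be illustrated with a minimal anisotropic network model: nodes closer than a cutoff are joined by identical harmonic springs, and the eigenvalues of the resulting Hessian give the stiffness of each collective mode. This is a sketch under stated assumptions, not the setup used in the study; the function name `anm_hessian`, the uniform spring constant `k`, the cutoff value, and the toy cube coordinates are all illustrative choices.

```python
import itertools

import numpy as np


def anm_hessian(coords, cutoff, k=1.0):
    """Build the 3N x 3N Hessian of a spring network with uniform stiffness k."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i, j in itertools.combinations(range(n), 2):
        r = coords[j] - coords[i]
        d2 = float(r @ r)
        if d2 > cutoff ** 2:
            continue  # nodes too far apart: no spring between them
        block = -k * np.outer(r, r) / d2  # off-diagonal 3x3 super-element
        hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
        hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
        # Diagonal super-elements are minus the sum of the off-diagonal ones.
        hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
        hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hessian


# Toy "structure": the eight corners of a unit cube, all within the cutoff.
cube = np.array(list(itertools.product([0.0, 1.0], repeat=3)))
hessian = anm_hessian(cube, cutoff=2.0)
modes = np.linalg.eigvalsh(hessian)
# Rigid-body translations and rotations cost no energy (zero eigenvalues).
n_rigid = int(np.sum(np.abs(modes) < 1e-8))
```

In a model of this kind, an extra contact such as a stator-rotor connection would enter as additional springs, stiffening any mode that moves the connected parts relative to each other.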
Subjects
Bacterial Proteins/chemistry, Insect Proteins/chemistry, Molecular Models, Proton-Translocating ATPases/chemistry, Saccharomyces cerevisiae Proteins/chemistry, Vacuolar Proton-Translocating ATPases/chemistry, Animals, Bacterial Proteins/metabolism, Biocatalysis, Protein Databases, Elastic Modulus, Finite Element Analysis, Insect Proteins/metabolism, Manduca/enzymology, Molecular Dynamics Simulation, Principal Component Analysis, Protein Conformation, Protein Interaction Domains and Motifs, Protein Multimerization, Protein Subunits/chemistry, Protein Subunits/metabolism, Proton-Translocating ATPases/metabolism, Saccharomyces cerevisiae/enzymology, Saccharomyces cerevisiae Proteins/metabolism, Thermus thermophilus/enzymology, Vacuolar Proton-Translocating ATPases/metabolism
ABSTRACT
Male seminal fluid proteins often show signs of positive selection and divergent evolution, believed to reflect male-female coevolution. Yet our understanding of the predicted concerted evolution of seminal fluid proteins and female reproductive proteins is limited. We sequenced, assembled, and annotated the genomes of two species of seed beetles, allowing a comparative analysis of four closely related species of these herbivorous insects. We compare the general pattern of evolution in genes encoding seminal fluid proteins and female reproductive proteins with that in digestive protein genes and well-conserved reference genes. We found that female reproductive proteins showed an overall ratio of nonsynonymous to synonymous substitutions (ω) similar to that of conserved genes, while seminal fluid proteins and digestive proteins exhibited higher overall ω values. Further, seminal fluid proteins and digestive proteins showed a higher proportion of sites putatively under positive selection, and explicit tests showed no difference in relaxed selection between protein types. Evolutionary rate covariation analyses showed that evolutionary rates among seminal fluid proteins were on average more closely correlated with those in female reproductive proteins than with those in either digestive or conserved genes. Gene expression showed the expected negative covariation with ω values, except for male-biased genes, where this negative relationship was reversed. In conclusion, seminal fluid proteins showed relatively rapid evolution and signs of positive selection. In contrast, female reproductive proteins evolved at a lower rate under selective constraints, on par with genes known to be well conserved. Although our findings provide support for concerted evolution of seminal fluid proteins and female reproductive proteins, they also suggest that these two classes of proteins evolve under partly distinct selective regimes.
Subjects
Coleoptera, Molecular Evolution, Genetic Selection, Animals, Coleoptera/genetics, Male, Female, Insect Proteins/genetics, Phylogeny, Insect Genome, Seminal Plasma Proteins/genetics, Genomics, Reproduction/genetics
ABSTRACT
Modified quantitative structure-retention relationships (QSRRs) are proposed and applied to describe two retention data sets: a set of 94 metabolites studied on a hydrophilic interaction chromatography system under organic content gradient conditions, and a set of tryptophan and its major metabolites analyzed on a reversed-phase chromatographic system under isocratic conditions as well as under pH and/or simultaneous pH and organic content gradient conditions. In the proposed modification, an additional descriptor is added to a conventional QSRR expression: the analyte retention time, tR(R), measured under the same elution conditions but on a second chromatographic column that serves as a reference. The 94 metabolites were studied on an Amide column using a Bare Silica column as the reference. For the second data set, a Kinetex EVO C18 and a Gemini-NX column were used, each serving as the reference column for the other. In all cases, we found a significant improvement in the performance of the QSRR models when the descriptor tR(R) was included.
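The effect of adding tR(R) as a descriptor can be sketched with a simple linear model. The example below assumes an ordinary least-squares QSRR and uses purely synthetic data; the abstract does not state the functional form of the models, and the descriptor values, coefficients, and noise levels here are invented for illustration only.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, RMSE)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rmse = float(np.sqrt(np.mean((A @ coef - y) ** 2)))
    return coef, rmse


# Synthetic stand-ins: two hypothetical molecular descriptors per analyte.
rng = np.random.default_rng(0)
n = 40
descriptors = rng.normal(size=(n, 2))
# Retention on the reference column, tR(R), depends on the descriptors
# plus analyte-specific effects the descriptors do not capture.
t_ref = 5.0 + descriptors @ np.array([1.5, -0.8]) + rng.normal(scale=0.5, size=n)
# Retention on the target column shares those uncaptured effects.
t_target = 1.0 + 0.9 * t_ref + rng.normal(scale=0.2, size=n)

# Conventional QSRR: molecular descriptors only.
_, rmse_conventional = fit_linear(descriptors, t_target)
# Modified QSRR: the same descriptors plus tR(R) as an extra column.
_, rmse_modified = fit_linear(np.column_stack([descriptors, t_ref]), t_target)
```

On training data the modified fit can never be worse than the conventional one, because the models are nested; a fair comparison of predictive performance would therefore need held-out analytes or cross-validation.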