Results 1 - 20 of 234
1.
Nat Methods ; 21(7): 1340-1348, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38918604

ABSTRACT

The EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein-nucleic acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic (1.9-2.5 Å) resolution. Three published maps were selected as targets: Escherichia coli beta-galactosidase with inhibitor, SARS-CoV-2 virus RNA-dependent RNA polymerase with covalently bound nucleotide analog and SARS-CoV-2 virus ion channel ORF3a with bound lipid. Sixty-one models were submitted from 17 independent research groups, each with supporting workflow details. The quality of submitted ligand models and surrounding atoms was analyzed by visual inspection and quantification of local map quality, model-to-map fit, geometry, energetics and contact scores. A composite rather than a single score was needed to assess macromolecule+ligand model quality. These observations lead us to recommend best practices for assessing cryo-EM structures of liganded macromolecules reported at near-atomic resolution.


Subjects
Cryoelectron Microscopy , Molecular Models , Cryoelectron Microscopy/methods , Ligands , SARS-CoV-2 , COVID-19/virology , Escherichia coli , beta-Galactosidase/chemistry , beta-Galactosidase/metabolism , Protein Conformation , Reproducibility of Results
2.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38856167

ABSTRACT

The genome-wide single-cell chromosome conformation capture technique, i.e. single-cell Hi-C (ScHi-C), was recently developed to interrogate the conformation of the genome of individual cells. However, single-cell Hi-C data are much sparser than bulk Hi-C data of a population of cells, and noise in single-cell Hi-C makes it difficult to apply and analyze them in biological research. Here, we developed the first generative diffusion models (HiCDiff) to denoise single-cell Hi-C data in the form of chromosomal contact matrices. HiCDiff uses a deep residual network to remove the noise in the reverse process of diffusion and can be trained in both unsupervised and supervised learning modes. Benchmarked on several single-cell Hi-C test datasets, the diffusion models substantially remove the noise in single-cell Hi-C data. The unsupervised HiCDiff outperforms most supervised non-diffusion deep learning methods and achieves performance comparable to that of the state-of-the-art supervised deep learning method in terms of multiple metrics, demonstrating that diffusion models are a useful approach to denoising single-cell Hi-C data. Moreover, its good performance holds on denoising bulk Hi-C data.
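
The abstract above describes removing noise in the reverse process of diffusion. As a rough illustration of the forward (noising) process that such a model learns to invert, the sketch below applies the commonly used Gaussian formulation x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * eps to a small contact matrix. The linear noise schedule, the function names and the toy matrix are all assumptions made for illustration; the abstract does not describe HiCDiff's actual schedule or implementation.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def add_noise(matrix, alpha_bar_t, rng):
    """Forward diffusion step: scale the signal and mix in Gaussian noise."""
    signal = math.sqrt(alpha_bar_t)
    noise = math.sqrt(1.0 - alpha_bar_t)
    return [[signal * v + noise * rng.gauss(0.0, 1.0) for v in row] for row in matrix]


rng = random.Random(0)
# Hypothetical normalized 3x3 contact matrix, not real Hi-C data.
contacts = [[1.0, 0.4, 0.1], [0.4, 1.0, 0.3], [0.1, 0.3, 1.0]]
alpha_bar = linear_alpha_bar(1000)

slightly_noisy = add_noise(contacts, alpha_bar[0], rng)   # early step: mostly signal
mostly_noise = add_noise(contacts, alpha_bar[-1], rng)    # late step: mostly noise
print(f"signal weight at first step: {math.sqrt(alpha_bar[0]):.4f}")
print(f"signal weight at last step:  {math.sqrt(alpha_bar[-1]):.4f}")
```

A denoising network in this setting would be trained to predict the injected noise (or the clean matrix) from the noisy input at each step.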


Subjects
Single-Cell Analysis , Single-Cell Analysis/methods , Humans , Computational Biology/methods , Deep Learning , Algorithms
3.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38860738

ABSTRACT

Picking protein particles in cryo-electron microscopy (cryo-EM) micrographs is a crucial step in cryo-EM-based structure determination. However, existing methods trained on a limited amount of cryo-EM data still cannot accurately pick protein particles from noisy cryo-EM images. General foundational artificial intelligence-based image segmentation models such as Meta's Segment Anything Model (SAM) cannot segment protein particles well because their training data do not include cryo-EM images. Here, we present a novel approach (CryoSegNet) that integrates an attention-gated U-shape network (U-Net), specially designed and trained for cryo-EM particle picking, with SAM. The U-Net is first trained on a large cryo-EM image dataset and then used to generate input from original cryo-EM images for SAM to pick particles. CryoSegNet shows both high precision and recall in segmenting protein particles from cryo-EM micrographs, irrespective of protein type, shape and size. On several independent datasets of various protein types, CryoSegNet outperforms two top machine learning particle pickers, crYOLO and Topaz, as well as SAM itself. The average resolution of density maps reconstructed from the particles picked by CryoSegNet is 3.33 Å, 7% better than the 3.58 Å of Topaz and 14% better than the 3.87 Å of crYOLO. It is publicly available at https://github.com/jianlin-cheng/CryoSegNet.
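
The percentages quoted in the abstract above can be checked from the three resolutions. The sketch below assumes that "better" means the relative reduction of the resolution value in Å (lower is better); the abstract does not state the formula, so the function name and this definition are illustrative.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Relative reduction of a lower-is-better value, as a percentage."""
    return (baseline - new) / baseline * 100.0


# Average resolutions in angstroms, as quoted in the abstract.
cryosegnet = 3.33
topaz = 3.58
cryolo = 3.87

vs_topaz = relative_improvement(topaz, cryosegnet)
vs_cryolo = relative_improvement(cryolo, cryosegnet)
print(f"vs Topaz:  {vs_topaz:.1f}%")
print(f"vs crYOLO: {vs_cryolo:.1f}%")
```

Under this definition the two values round to 7% and 14%, matching the figures in the abstract.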


Subjects
Cryoelectron Microscopy , Computer-Assisted Image Processing , Cryoelectron Microscopy/methods , Computer-Assisted Image Processing/methods , Proteins/chemistry , Artificial Intelligence , Algorithms , Protein Databases
4.
Hum Mol Genet ; 32(13): 2205-2218, 2023 06 19.
Article in English | MEDLINE | ID: mdl-37014740

ABSTRACT

As an aneuploidy, trisomy is associated with mammalian embryonic and postnatal abnormalities. Understanding the underlying mechanisms involved in mutant phenotypes is broadly important and may lead to new strategies to treat clinical manifestations in individuals with trisomies, such as trisomy 21 [Down syndrome (DS)]. Although increased gene dosage effects because of a trisomy may account for the mutant phenotypes, there is also the possibility that phenotypic consequences of a trisomy can arise because of the presence of a freely segregating extra chromosome with its own centromere, i.e. a 'free trisomy' independent of gene dosage effects. Presently, there are no reports of attempts to functionally separate these two types of effects in mammals. To fill this gap, here we describe a strategy that employed two new mouse models of DS, Ts65Dn;Df(17)2Yey/+ and Dp(16)1Yey/Df(16)8Yey. Both models carry triplications of the same 103 human chromosome 21 gene orthologs; however, only Ts65Dn;Df(17)2Yey/+ mice carry a free trisomy. Comparison of these models revealed the gene dosage-independent impacts of an extra chromosome at the phenotypic and molecular levels for the first time. They are reflected by impairments of Ts65Dn;Df(17)2Yey/+ males in T-maze tests when compared with Dp(16)1Yey/Df(16)8Yey males. Results from the transcriptomic analysis suggest the extra chromosome plays a major role in trisomy-associated expression alterations of disomic genes beyond gene dosage effects. This model system can now be used to deepen our mechanistic understanding of this common human aneuploidy and obtain new insights into the effects of free trisomies in other human diseases such as cancers.


Subjects
Down Syndrome , Male , Mice , Humans , Animals , Down Syndrome/genetics , Trisomy/genetics , Aneuploidy , Chromosomes , Gene Dosage , Animal Disease Models , Mammals/genetics
5.
Bioinformatics ; 40(2)2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38373819

ABSTRACT

MOTIVATION: The field of geometric deep learning has recently had a profound impact on several scientific domains such as protein structure prediction and design, leading to methodological advancements within and outside of the realm of traditional machine learning. Within this spirit, in this work, we introduce GCPNet, a new chirality-aware SE(3)-equivariant graph neural network designed for representation learning of 3D biomolecular graphs. We show that GCPNet, unlike previous representation learning methods for 3D biomolecules, is widely applicable to a variety of invariant or equivariant node-level, edge-level, and graph-level tasks on biomolecular structures while being able to (1) learn important chiral properties of 3D molecules and (2) detect external force fields. RESULTS: Across four distinct molecular-geometric tasks, we demonstrate that GCPNet's predictions (1) for protein-ligand binding affinity achieve a statistically significant correlation of 0.608, more than 5% greater than current state-of-the-art methods; (2) for protein structure ranking achieve statistically significant target-local and dataset-global correlations of 0.616 and 0.871, respectively; (3) for Newtonian many-body systems modeling achieve a task-averaged mean squared error less than 0.01, more than 15% better than current methods; and (4) for molecular chirality recognition achieve a state-of-the-art prediction accuracy of 98.7%, better than any other machine learning method to date. AVAILABILITY AND IMPLEMENTATION: The source code, data, and instructions to train new models or reproduce our results are freely available at https://github.com/BioinfoMachineLearning/GCPNet.


Subjects
Machine Learning , Computer Neural Networks , Software
6.
Bioinformatics ; 40(3)2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38407301

ABSTRACT

MOTIVATION: Cryo-electron microscopy (cryo-EM) is a powerful technique for determining the structures of large protein complexes. Picking single protein particles from cryo-EM micrographs (images) is a crucial step in reconstructing protein structures from them. However, the widely used template-based particle picking process requires some manual particle picking and is labor-intensive and time-consuming. Though machine learning and artificial intelligence (AI) can potentially automate particle picking, the current AI methods pick particles with low precision or low recall. The erroneously picked particles can severely reduce the quality of reconstructed protein structures, especially for the micrographs with low signal-to-noise ratio. RESULTS: To address these shortcomings, we devised CryoTransformer based on transformers, residual networks, and image processing techniques to accurately pick protein particles from cryo-EM micrographs. CryoTransformer was trained and tested on the largest labeled cryo-EM protein particle dataset, CryoPPP. It outperforms the current state-of-the-art machine learning methods of particle picking in terms of the resolution of 3D density maps reconstructed from the picked particles as well as F1-score, and is poised to facilitate the automation of the cryo-EM protein particle picking. AVAILABILITY AND IMPLEMENTATION: The source code and data for CryoTransformer are openly available at: https://github.com/jianlin-cheng/CryoTransformer.
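
The abstract above evaluates particle pickers by precision, recall and F1-score. As a minimal sketch of how these relate, the function below computes all three from counts of correctly picked particles, spurious picks and missed particles. The counts are hypothetical and the rule for matching a picked particle to a labeled one is not specified in the abstract, so this only illustrates the standard definitions.

```python
def precision_recall_f1(true_positives: int, false_positives: int, false_negatives: int):
    """Standard precision, recall and F1 from detection counts."""
    picked = true_positives + false_positives      # everything the picker selected
    actual = true_positives + false_negatives      # everything that was labeled
    precision = true_positives / picked if picked else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one micrograph, not taken from the paper.
p, r, f1 = precision_recall_f1(true_positives=80, false_positives=20, false_negatives=20)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a picker with low precision or low recall (the failure mode the abstract describes) scores poorly even if the other value is high.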


Subjects
Artificial Intelligence , Software , Cryoelectron Microscopy/methods , Machine Learning , Computer-Assisted Image Processing/methods , Proteins
7.
Proteomics ; : e2300471, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-38996351

ABSTRACT

Predicting protein function from protein sequence, structure, interaction, and other relevant information is important for generating hypotheses for biological experiments and studying biological systems, and therefore has been a major challenge in protein bioinformatics. Numerous computational methods have been developed to advance protein function prediction gradually over the last two decades. In recent years in particular, leveraging the revolutionary advances in artificial intelligence (AI), more and more deep learning methods have been developed to improve protein function prediction at a faster pace. Here, we provide an in-depth review of the recent developments of deep learning methods for protein function prediction. We summarize the significant advances in the field, identify several remaining major challenges to be tackled, and suggest some potential directions to explore. The data sources and evaluation metrics widely used in protein function prediction are also discussed to assist the machine learning, AI, and bioinformatics communities to develop more cutting-edge methods to advance protein function prediction.

8.
Nat Methods ; 18(2): 156-164, 2021 02.
Article in English | MEDLINE | ID: mdl-33542514

ABSTRACT

This paper describes outcomes of the 2019 Cryo-EM Model Challenge. The goals were to (1) assess the quality of models that can be produced from cryogenic electron microscopy (cryo-EM) maps using current modeling software, (2) evaluate reproducibility of modeling results from different software developers and users and (3) compare performance of current metrics used for model evaluation, particularly Fit-to-Map metrics, with focus on near-atomic resolution. Our findings demonstrate the relatively high accuracy and reproducibility of cryo-EM models derived by 13 participating teams from four benchmark maps, including three forming a resolution series (1.8 to 3.1 Å). The results permit specific recommendations to be made about validating near-atomic cryo-EM structures both in the context of individual experiments and structure data archives such as the Protein Data Bank. We recommend the adoption of multiple scoring parameters to provide full and objective annotation and assessment of the model, reflective of the observed cryo-EM map density.


Subjects
Cryoelectron Microscopy/methods , Molecular Models , X-Ray Crystallography , Protein Conformation , Proteins/chemistry
9.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34849575

ABSTRACT

New drug production, from target identification to marketing approval, takes over 12 years and can cost around $2.6 billion. Furthermore, the COVID-19 pandemic has unveiled the urgent need for more powerful computational methods for drug discovery. Here, we review the computational approaches to predicting protein-ligand interactions in the context of drug discovery, focusing on methods using artificial intelligence (AI). We begin with a brief introduction to proteins (targets), ligands (e.g. drugs) and their interactions for nonexperts. Next, we review databases that are commonly used in the domain of protein-ligand interactions. Finally, we survey and analyze the machine learning (ML) approaches implemented to predict protein-ligand binding sites, ligand-binding affinity and binding pose (conformation) including both classical ML algorithms and recent deep learning methods. After exploring the correlation between these three aspects of protein-ligand interaction, it has been proposed that they should be studied in unison. We anticipate that our review will aid exploration and development of more accurate ML-based prediction strategies for studying protein-ligand interactions.


Subjects
Antiviral Agents , COVID-19 Drug Treatment , COVID-19 , Deep Learning , Drug Discovery , Protein Interaction Maps , SARS-CoV-2/metabolism , Antiviral Agents/chemistry , Antiviral Agents/pharmacokinetics , COVID-19/metabolism , Humans , Ligands
10.
Bioinformatics ; 39(39 Suppl 1): i318-i325, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387145

ABSTRACT

MOTIVATION: Millions of protein sequences have been generated by numerous genome and transcriptome sequencing projects. However, experimentally determining the function of the proteins is still a time-consuming, low-throughput, and expensive process, leading to a large protein sequence-function gap. Therefore, it is important to develop computational methods to accurately predict protein function to fill the gap. Even though many methods have been developed to use protein sequences as input to predict function, far fewer methods leverage protein structures in protein function prediction because there was a lack of accurate protein structures for most proteins until recently. RESULTS: We developed TransFun, a method using a transformer-based protein language model and 3D-equivariant graph neural networks to distill information from both protein sequences and structures to predict protein function. It extracts feature embeddings from protein sequences using a pre-trained protein language model (ESM) via transfer learning and combines them with 3D structures of proteins predicted by AlphaFold2 through equivariant graph neural networks. Benchmarked on the CAFA3 test dataset and a new test dataset, TransFun outperforms several state-of-the-art methods, indicating that the language model and 3D-equivariant graph neural networks are effective methods to leverage protein sequences and structures to improve protein function prediction. Combining TransFun predictions and sequence similarity-based predictions can further increase prediction accuracy. AVAILABILITY AND IMPLEMENTATION: The source code of TransFun is available at https://github.com/jianlin-cheng/TransFun.


Subjects
Benchmarking , Language , Amino Acid Sequence , Computer Neural Networks , Software
11.
Bioinformatics ; 39(5)2023 05 04.
Article in English | MEDLINE | ID: mdl-37144951

ABSTRACT

MOTIVATION: State-of-the-art protein structure prediction methods such as AlphaFold are being widely used to predict structures of uncharacterized proteins in biomedical research. There is a significant need to further improve the quality and nativeness of the predicted structures to enhance their usability. In this work, we develop ATOMRefine, a deep learning-based, end-to-end, all-atom protein structural model refinement method. It uses an SE(3)-equivariant graph transformer network to directly refine protein atomic coordinates in a predicted tertiary structure represented as a molecular graph. RESULTS: The method is first trained and tested on the structural models in AlphaFoldDB whose experimental structures are known, and then blindly tested on 69 CASP14 regular targets and 7 CASP14 refinement targets. ATOMRefine improves the quality of both backbone atoms and all-atom conformation of the initial structural models generated by AlphaFold. It also performs better than two state-of-the-art refinement methods in multiple evaluation metrics including an all-atom model quality score, the MolProbity score, based on the analysis of all-atom contacts, bond length, atom clashes, torsion angles, and side-chain rotamers. As ATOMRefine can refine a protein structure quickly, it provides a viable, fast solution for improving protein geometry and fixing structural errors of predicted structures through direct coordinate refinement. AVAILABILITY AND IMPLEMENTATION: The source code of ATOMRefine is available in the GitHub repository (https://github.com/BioinfoMachineLearning/ATOMRefine). All the required data for training and testing are available at https://doi.org/10.5281/zenodo.6944368.


Subjects
Proteins , Software , Proteins/chemistry , Molecular Conformation
12.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37498561

ABSTRACT

MOTIVATION: The spatial genome organization of a eukaryotic cell is important for its function. The development of single-cell technologies for probing the 3D genome conformation, especially single-cell chromosome conformation capture techniques, has enabled us to understand genome function better than before. However, due to extreme sparsity and high noise associated with single-cell Hi-C data, it is still difficult to study genome structure and function using the Hi-C data of one single cell. RESULTS: In this work, we developed a deep learning method, ScHiCEDRN, based on deep residual networks and generative adversarial networks for the imputation and enhancement of Hi-C data of a single cell. In terms of both image evaluation and Hi-C reproducibility metrics, ScHiCEDRN outperforms the four deep learning methods (DeepHiC, HiCPlus, HiCSR, and Loopenhance) on enhancing the raw single-cell Hi-C data of human and Drosophila. The experiments also show that it can generate single-cell Hi-C data more suitable for identifying topologically associating domain boundaries and reconstructing 3D chromosome structures than the existing methods. Moreover, ScHiCEDRN's performance generalizes well across different single cells and cell types, and it can be applied to improving population Hi-C data. AVAILABILITY AND IMPLEMENTATION: The source code of ScHiCEDRN is available at the GitHub repository: https://github.com/BioinfoMachineLearning/ScHiCEDRN.


Subjects
Chromosomes , Genome , Humans , Reproducibility of Results , Chromosome Structures , Software , Chromatin
13.
Bioinformatics ; 39(39 Suppl 1): i308-i317, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387159

ABSTRACT

MOTIVATION: Proteins interact to form complexes to carry out essential biological functions. Computational methods such as AlphaFold-multimer have been developed to predict the quaternary structures of protein complexes. An important yet largely unsolved challenge in protein complex structure prediction is to accurately estimate the quality of predicted protein complex structures without any knowledge of the corresponding native structures. Such estimations can then be used to select high-quality predicted complex structures to facilitate biomedical research such as protein function analysis and drug discovery. RESULTS: In this work, we introduce a new gated neighborhood-modulating graph transformer to predict the quality of 3D protein complex structures. It incorporates node and edge gates within a graph transformer framework to control information flow during graph message passing. We trained, evaluated and tested the method (called DProQA) on newly-curated protein complex datasets before the 15th Critical Assessment of Techniques for Protein Structure Prediction (CASP15) and then blindly tested it in the 2022 CASP15 experiment. The method was ranked 3rd among the single-model quality assessment methods in CASP15 in terms of the ranking loss of TM-score on 36 complex targets. The rigorous internal and external experiments demonstrate that DProQA is effective in ranking protein complex structures. AVAILABILITY AND IMPLEMENTATION: The source code, data, and pre-trained models are available at https://github.com/jianlin-cheng/DProQA.


Subjects
Biomedical Research , Drug Discovery , Software
14.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36637199

ABSTRACT

MOTIVATION: Quality assessment (QA) of predicted protein tertiary structure models plays an important role in ranking and using them. With the recent development of deep learning end-to-end protein structure prediction techniques for generating highly confident tertiary structures for most proteins, it is important to explore corresponding QA strategies to evaluate and select the structural models predicted by them since these models have better quality and different properties than the models predicted by traditional tertiary structure prediction methods. RESULTS: We develop EnQA, a novel graph-based 3D-equivariant neural network method that is equivariant to rotation and translation of 3D objects to estimate the accuracy of protein structural models by leveraging the structural features acquired from the state-of-the-art tertiary structure prediction method, AlphaFold2. We train and test the method on both traditional model datasets (e.g. the datasets of the Critical Assessment of Techniques for Protein Structure Prediction) and a new dataset of high-quality structural models predicted only by AlphaFold2 for the proteins whose experimental structures were released recently. Our approach achieves state-of-the-art performance on protein structural models predicted by both traditional protein structure prediction methods and the latest end-to-end deep learning method, AlphaFold2. It performs even better than the model QA scores provided by AlphaFold2 itself. The results illustrate that the 3D-equivariant graph neural network is a promising approach to the evaluation of protein structural models. Integrating AlphaFold2 features with other complementary sequence and structural features is important for improving protein model QA. AVAILABILITY AND IMPLEMENTATION: The source code is available at https://github.com/BioinfoMachineLearning/EnQA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
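
The abstract above relies on a network whose behavior is consistent under rotation and translation of the input structure. A simple way to see why this matters is that interatomic distances, and therefore any quality score built only from them, do not change when a structure is rigidly moved. The sketch below checks this for a toy set of coordinates. It is not EnQA's network; the coordinates, rotation angle and shift are hypothetical values chosen for illustration.

```python
import math


def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def translate(point, shift):
    """Shift a 3D point by a fixed offset."""
    return tuple(p + d for p, d in zip(point, shift))


def pairwise_distances(points):
    """Distances between every unordered pair of points."""
    return [math.dist(a, b) for i, a in enumerate(points) for b in points[i + 1:]]


# Hypothetical atom coordinates, not taken from any real structure.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.5)]
moved = [translate(rotate_z(p, 0.7), (3.0, -2.0, 1.0)) for p in coords]

before = pairwise_distances(coords)
after = pairwise_distances(moved)
unchanged = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after))
print("pairwise distances unchanged by rigid motion:", unchanged)
```

An equivariant network builds this property into its layers, so the predicted accuracy of a model does not depend on how the structure happens to be oriented in space.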


Subjects
Computer Neural Networks , Proteins , Proteins/chemistry , Software , Rotation
15.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37589594

ABSTRACT

MOTIVATION: Sphagnum-dominated peatlands store a substantial amount of terrestrial carbon. The genus is undersampled and under-studied. No experimental crystal structure from any Sphagnum species exists in the Protein Data Bank and fewer than 200 Sphagnum-related genes have structural models available in the AlphaFold Protein Structure Database. Tools and resources are needed to help bridge these gaps, and to enable the analysis of other structural proteomes now made possible by accurate structure prediction. RESULTS: We present the predicted structural proteome (25,134 primary transcripts) of Sphagnum divinum computed using AlphaFold, structural alignment results of all high-confidence models against an annotated nonredundant crystallographic database of over 90,000 structures, a structure-based classification of putative Enzyme Commission (EC) numbers across this proteome, and the computational method to perform this proteome-scale structure-based annotation. AVAILABILITY AND IMPLEMENTATION: All data and code are available in public repositories, detailed at https://github.com/BSDExabio/SAFA. The structural models of the S. divinum proteome have been deposited in the ModelArchive repository at https://modelarchive.org/doi/10.5452/ma-ornl-sphdiv.


Subjects
Plant Proteins , Proteome , Sphagnopsida , Sphagnopsida/chemistry , Sphagnopsida/enzymology , Plant Proteins/chemistry , Workflow , Protein Structural Homology
16.
Plant Cell ; 33(4): 901-916, 2021 05 31.
Article in English | MEDLINE | ID: mdl-33656551

ABSTRACT

The phenotypic consequences of the addition or subtraction of part of a chromosome are more severe than those of changing the dosage of the whole genome. By crossing diploid trisomies to a haploid inducer, we identified 17 distal segmental haploid disomies that cover ∼80% of the maize genome. Disomic haploids provide a level of genomic imbalance that is not ordinarily achievable in multicellular eukaryotes, allowing the impact to be stronger and more easily studied. Transcriptome size estimates revealed that a few disomies inversely modulate most of the transcriptome. Based on RNA sequencing, the expression levels of genes located on the varied chromosome arms (cis) in disomies ranged from being proportional to chromosomal dosage (dosage effect) to showing dosage compensation with no expression change with dosage. For genes not located on the varied chromosome arm (trans), an obvious trans-acting effect can be observed, with the majority showing a decreased modulation (inverse effect). The extent of dosage compensation of varied cis genes correlates with the extent of trans inverse effects across the 17 genomic regions studied. The results also have implications for the role of stoichiometry in gene expression, the control of quantitative traits, and the evolution of dosage-sensitive genes.


Subjects
Plant Gene Expression Regulation , Haploidy , Zea mays/genetics , Plant Chromosomes , Genetic Dosage Compensation , Plant Genes , Plant Genome , RNA Sequence Analysis
17.
Plant Cell ; 33(4): 917-939, 2021 05 31.
Article in English | MEDLINE | ID: mdl-33677584

ABSTRACT

Genomic imbalance caused by changing the dosage of individual chromosomes (aneuploidy) has a more detrimental effect than varying the dosage of complete sets of chromosomes (ploidy). We examined the impact of both increased and decreased dosage of 15 distal and 1 interstitial chromosomal regions via RNA-seq of maize (Zea mays) mature leaf tissue to reveal new aspects of genomic imbalance. The results indicate that significant changes in gene expression in aneuploids occur both on the varied chromosome (cis) and the remainder of the genome (trans), with a wider spread of modulation compared with the whole-ploidy series of haploid to tetraploid. In general, cis genes in aneuploids range from a gene-dosage effect to dosage compensation, whereas for trans genes the most common effect is an inverse correlation in that expression is modulated toward the opposite direction of the varied chromosomal dosage, although positive modulations also occur. Furthermore, this analysis revealed the existence of increased and decreased effects in which the expression of many genes under genome imbalance is modulated toward the same direction regardless of increased or decreased chromosomal dosage, which is predicted from kinetic considerations of multicomponent molecular interactions. The findings provide novel insights into understanding mechanistic aspects of gene regulation.


Subjects
Diploidy , Plant Gene Expression Regulation , Zea mays/genetics , Aneuploidy , Plant Chromosomes , Genetic Dosage Compensation , Plant Genome , Ploidies
18.
Proc Natl Acad Sci U S A ; 118(23)2021 06 08.
Article in English | MEDLINE | ID: mdl-34088847

ABSTRACT

B chromosomes are enigmatic elements in thousands of plant and animal genomes that persist in populations despite being nonessential. They circumvent the laws of Mendelian inheritance but the molecular mechanisms underlying this behavior remain unknown. Here we present the sequence, annotation, and analysis of the maize B chromosome providing insight into its drive mechanism. The sequence assembly reveals detailed locations of the elements involved with the cis and trans functions of its drive mechanism, consisting of nondisjunction at the second pollen mitosis and preferential fertilization of the egg by the B-containing sperm. We identified 758 protein-coding genes in 125.9 Mb of B chromosome sequence, of which at least 88 are expressed. Our results demonstrate that transposable elements in the B chromosome are shared with the standard A chromosome set but multiple lines of evidence fail to detect a syntenic genic region in the A chromosomes, suggesting a distant origin. The current gene content is a result of continuous transfer from the A chromosomal complement over an extended evolutionary time with subsequent degradation but with selection for maintenance of this nonvital chromosome.


Subjects
Plant Chromosomes/genetics , Molecular Evolution , Pollen/genetics , Pregnancy Proteins/genetics , Zea mays/genetics , Meiosis/genetics , Mitosis/genetics
19.
Plant J ; 110(1): 193-211, 2022 04.
Article in English | MEDLINE | ID: mdl-34997647

ABSTRACT

The non-essential supernumerary maize (Zea mays) B chromosome (B) has recently been shown to contain active genes and to be capable of impacting gene expression of the A chromosomes. However, the effect of the B chromosome on gene expression is still unclear. In addition, it is unknown whether the accumulation of the B chromosome has a cumulative effect on gene expression. To examine these questions, the global expression of genes, microRNAs (miRNAs), and transposable elements (TEs) of leaf tissue of maize W22 plants with 0-7 copies of the B chromosome was studied. All experimental genotypes with B chromosomes displayed a trend of upregulated gene expression for a subset of A-located genes compared to the control. Over 3000 A-located genes are significantly differentially expressed in all experimental genotypes with the B chromosome relative to the control. Modulations of these genes are largely determined by the presence rather than the copy number of the B chromosome. By contrast, the expression of most B-located genes is positively correlated with B copy number, showing a proportional gene dosage effect. The B chromosome also causes increased expression of A-located miRNAs. Differentially expressed miRNAs potentially regulate their targets in a cascade of effects. Furthermore, the varied copy number of the B chromosome leads to the differential expression of A-located and B-located TEs. The findings provide novel insights into the function and properties of the B chromosome.


Subjects
Plant Chromosomes , Zea mays , Aneuploidy , Plant Chromosomes/genetics , DNA Transposable Elements/genetics , Gene Expression , Plant Gene Expression Regulation/genetics , Zea mays/genetics
20.
Proteins ; 91(12): 1889-1902, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37357816

ABSTRACT

Estimating the accuracy of quaternary structural models of protein complexes and assemblies (EMA) is important for predicting quaternary structures and applying them to studying protein function and interaction. The pairwise similarity between structural models has proven useful for estimating the quality of protein tertiary structural models, but it has rarely been applied to predicting the quality of quaternary structural models. Moreover, the pairwise similarity approach often fails when many structural models are of low quality and similar to each other. To address the gap, we developed a hybrid method (MULTICOM_qa) combining a pairwise similarity score (PSS) and an interface contact probability score (ICPS) based on the deep learning inter-chain contact prediction for estimating protein complex model accuracy. It blindly participated in the 15th Critical Assessment of Techniques for Protein Structure Prediction (CASP15) in 2022 and performed very well in estimating the global structure accuracy of assembly models. The average per-target correlation coefficient between the model quality scores predicted by MULTICOM_qa and the true quality scores of the models of CASP15 assembly targets is 0.66. The average per-target ranking loss in using the predicted quality scores to rank the models is 0.14. It was able to select good models for most targets. Moreover, several key factors (i.e., target difficulty, model sampling difficulty, skewness of model quality, and similarity between good/bad models) for EMA are identified and analyzed. The results demonstrate that combining the multi-model method (PSS) with the complementary single-model method (ICPS) is a promising approach to EMA.
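
The two summary metrics in the abstract above, average per-target correlation and average per-target ranking loss, can be sketched as follows. The sketch assumes a Pearson correlation and defines ranking loss as the true score of the best available model minus the true score of the model ranked first by the predicted scores; the abstract does not spell out either choice, and all scores below are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranking_loss(predicted, true):
    """True score of the best model minus true score of the top-predicted model."""
    top = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true) - true[top]


# Hypothetical (predicted, true) quality scores for two targets.
targets = {
    "target_A": ([0.60, 0.80, 0.70, 0.40], [0.55, 0.70, 0.82, 0.30]),
    "target_B": ([0.50, 0.90, 0.30], [0.45, 0.88, 0.35]),
}

correlations = [pearson(pred, true) for pred, true in targets.values()]
losses = [ranking_loss(pred, true) for pred, true in targets.values()]
mean_correlation = sum(correlations) / len(correlations)
mean_loss = sum(losses) / len(losses)
print(f"average per-target correlation:  {mean_correlation:.2f}")
print(f"average per-target ranking loss: {mean_loss:.2f}")
```

Computing both metrics per target and then averaging keeps easy targets with many models from dominating the summary, which is why the abstract reports per-target averages.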


Subjects
Deep Learning , Molecular Models , Proteins/chemistry