Results 1 - 20 of 152
1.
Nat Methods ; 21(2): 279-289, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38167654

ABSTRACT

Leveraging iterative alignment search through genomic and metagenome sequence databases, we report the DeepMSA2 pipeline for uniform protein single- and multichain multiple-sequence alignment (MSA) construction. Large-scale benchmarks show that DeepMSA2 MSAs can remarkably increase the accuracy of protein tertiary and quaternary structure predictions compared with current state-of-the-art methods. An integrated pipeline with DeepMSA2 participated in the most recent CASP15 experiment and created complex structural models with considerably higher quality than the AlphaFold2-Multimer server (v.2.2.0). Detailed data analyses show that the major advantage of DeepMSA2 lies in its balanced alignment search and effective model selection, and in the power of integrating huge metagenomics databases. These results demonstrate a new avenue to improve deep learning protein structure prediction through advanced MSA construction and provide additional evidence that optimization of input information to deep learning-based structure prediction methods must be considered with as much care as the design of the predictor itself.


Subject(s)
Deep Learning , Computational Biology/methods , Proteins/genetics , Proteins/chemistry , Sequence Alignment , Genomics , Algorithms
2.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-39038936

ABSTRACT

Sequence database searches followed by homology-based function transfer form one of the oldest and most popular approaches for predicting protein functions, such as Gene Ontology (GO) terms. These searches are also a critical component in most state-of-the-art machine learning and deep learning-based protein function predictors. Although sequence search tools are the basis of homology-based protein function prediction, previous studies have scarcely explored how to select the optimal sequence search tools and configure their parameters to achieve the best function prediction. In this paper, we evaluate the effect of using different options from among popular search tools, as well as the impacts of search parameters, on protein function prediction. When predicting GO terms on a large benchmark dataset, we found that BLASTp and MMseqs2 consistently exceed the performance of other tools, including DIAMOND, one of the most popular tools for function prediction, under default search parameters. However, with the correct parameter settings, DIAMOND can perform comparably to BLASTp and MMseqs2 in function prediction. Additionally, we developed a new scoring function to derive GO predictions from homologous hits that consistently outperforms previously proposed scoring functions. These findings enable the improvement of almost all protein function prediction algorithms with a few easily implementable changes in their sequence homolog-based component. This study emphasizes the critical role of search parameter settings in homology-based function transfer and should make an important contribution to the development of future protein function prediction algorithms.
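As a rough illustration of homology-based function transfer, the sketch below scores each GO term by the bit-score-weighted fraction of search hits annotated with it. This is a commonly used baseline, not the new scoring function reported in the abstract, whose form is not given here. The function name `go_scores`, the input layout, and the example identifiers are assumptions made for illustration.

```python
from collections import defaultdict


def go_scores(hits, annotations):
    """Score GO terms by the bit-score-weighted fraction of hits carrying them.

    hits: list of (subject_id, bitscore) pairs from any sequence search tool.
    annotations: dict mapping subject_id to a set of GO term IDs.
    Both structures are hypothetical; real tools emit tabular output
    that would need parsing first.
    """
    total = sum(bitscore for _, bitscore in hits)
    if total <= 0:
        return {}
    weighted = defaultdict(float)
    for subject_id, bitscore in hits:
        # Hits without annotations still count toward the denominator.
        for term in annotations.get(subject_id, set()):
            weighted[term] += bitscore
    return {term: value / total for term, value in weighted.items()}


if __name__ == "__main__":
    # Illustrative hits and annotations, not real search results.
    example_hits = [("protA", 100.0), ("protB", 50.0), ("protC", 50.0)]
    example_annotations = {
        "protA": {"GO:0005524", "GO:0016301"},
        "protB": {"GO:0005524"},
        "protC": set(),
    }
    for term, score in sorted(go_scores(example_hits, example_annotations).items()):
        print(term, round(score, 3))
```

Because the score depends only on which hits are returned and on their bit-scores, changing the search tool or its sensitivity settings changes the prediction directly, which is the dependence the abstract examines.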


Asunto(s)
Bases de Datos de Proteínas , Proteínas , Proteínas/química , Proteínas/metabolismo , Proteínas/genética , Biología Computacional/métodos , Ontología de Genes , Algoritmos , Análisis de Secuencia de Proteína/métodos , Programas Informáticos , Aprendizaje Automático
3.
Nucleic Acids Res ; 52(D1): D404-D412, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37522378

ABSTRACT

With the progress of structural biology, the Protein Data Bank (PDB) has witnessed rapid accumulation of experimentally solved protein structures. Since many structures are determined with purification and crystallization additives that are unrelated to a protein's in vivo function, it is nontrivial to identify the subset of protein-ligand interactions that are biologically relevant. We developed the BioLiP2 database (https://zhanggroup.org/BioLiP) to extract biologically relevant protein-ligand interactions from the PDB database. BioLiP2 assesses the functional relevance of the ligands by geometric rules and experimental literature validations. The ligand binding information is further enriched with other function annotations, including Enzyme Commission numbers, Gene Ontology terms, catalytic sites, and binding affinities collected from other databases and a manual literature survey. Compared to its predecessor BioLiP, BioLiP2 offers significantly greater coverage of nucleic acid-protein interactions, and interactions involving large complexes that are unavailable in PDB format. BioLiP2 also integrates cutting-edge structural alignment algorithms with state-of-the-art structure prediction techniques, which for the first time enables composite protein structure and sequence-based searching and significantly enhances the usefulness of the database in structure-based function annotations. With these new developments, BioLiP2 will continue to be an important and comprehensive database for docking, virtual screening, and structure-based protein function analyses.


Subject(s)
Algorithms , Protein Databases , Proteins , Binding Sites , Ligands , Proteins/chemistry
4.
Nat Methods ; 19(9): 1109-1115, 2022 09.
Article in English | MEDLINE | ID: mdl-36038728

ABSTRACT

Structure comparison and alignment are of fundamental importance in structural biology studies. We developed the first universal platform, US-align, to uniformly align monomer and complex structures of different macromolecules: proteins, RNAs and DNAs. The pipeline is built on a uniform TM-score objective function coupled with a heuristic alignment searching algorithm. Large-scale benchmarks demonstrated consistent advantages of US-align over state-of-the-art methods in pairwise and multiple structure alignments of different molecules. Detailed analyses showed that the main advantage of US-align lies in the extensive optimization of the unified objective function powered by efficient heuristic search iterations, which substantially improve the accuracy and speed of the structural alignment process. Meanwhile, the universal protocol fusing different molecular and structural types helps facilitate heterogeneous oligomer structure comparison and template-based protein-protein and protein-RNA/DNA docking.
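For readers unfamiliar with the TM-score mentioned above, the following minimal sketch evaluates it for one fixed superposition, given the distances between aligned residue pairs. It uses the widely cited protein form of the length-dependent scale d0. The clamp for short chains and the function names are assumptions, and the search over alignments and superpositions that US-align performs is not shown.

```python
def protein_d0(target_length):
    """Length-dependent distance scale (angstroms) in the protein TM-score."""
    if target_length <= 21:
        # Assumed floor for very short chains, where the cube-root formula
        # would give a tiny or negative value.
        return 0.5
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM-score of one superposition.

    distances: distances (angstroms) between aligned residue pairs after
        superposition; unaligned residues are simply absent from the list.
    target_length: length of the structure used for normalization.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = protein_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


if __name__ == "__main__":
    # Hypothetical distances for a 100-residue target with 80 aligned pairs.
    example = [1.0] * 60 + [4.0] * 20
    print(round(tm_score(example, 100), 3))
```

Normalizing by the target length rather than by the number of aligned pairs is what penalizes partial alignments.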


Subject(s)
Nucleic Acids , Software , Algorithms , Macromolecular Substances , Proteins/chemistry , RNA , Sequence Alignment
5.
Microb Pathog ; 190: 106630, 2024 May.
Article in English | MEDLINE | ID: mdl-38556102

ABSTRACT

Porcine circovirus type 2 (PCV2) is a globally prevalent infectious pathogen affecting swine, with its capsid protein (Cap) being the sole structural protein critical for vaccine development. Prior research has demonstrated that PCV2 Cap proteins produced in Escherichia coli (E. coli) can form virus-like particles (VLPs) in vitro, and nuclear localization signal peptides (NLS) play a pivotal role in stabilizing PCV2 VLPs. Recently, PCV2d has emerged as an important strain within the PCV2 epidemic. In this study, we systematically optimized the PCV2d Cap protein and successfully produced intact PCV2d VLPs containing NLS using E. coli. The recombinant PCV2d Cap protein was purified through affinity chromatography, yielding 7.5 mg of recombinant protein per 100 ml of bacterial culture. We augmented the conventional buffer system with various substances such as arginine, β-mercaptoethanol, glycerol, polyethylene glycol, and glutathione to promote VLP assembly. The recombinant PCV2d Cap self-assembled into VLPs approximately 20 nm in diameter, featuring uniform distribution and exceptional stability in the optimized buffer. We developed the vaccine and immunized pigs and mice, evaluating the immunogenicity of the PCV2d VLPs vaccine by measuring PCV2-IgG, IL-4, TNF-α, and IFN-γ levels, comparing them to commercial vaccines utilizing truncated PCV2 Cap antigens. The HE staining and immunohistochemical tests confirmed that the PCV2 VLPs vaccine offered robust protection. The results revealed that animals vaccinated with the PCV2d VLPs vaccine exhibited high levels of PCV2 antibodies, with TNF-α and IFN-γ levels rapidly increasing at 14 days post-immunization, which were higher than those observed with commercially available vaccines, particularly in the mouse trial. This could be due to the fact that full-length Cap proteins can assemble into more stable PCV2d VLPs in the assembling buffer. In conclusion, the PCV2d VLPs vaccine we produced elicited stronger immune responses in pigs and mice compared to commercial vaccines. The PCV2d VLPs from this study serve as an excellent candidate vaccine antigen, providing insights for PCV2d vaccine research.


Subject(s)
Viral Antibodies , Capsid Proteins , Circovirus , Escherichia coli , Recombinant Proteins , Virus-Like Particle Vaccines , Animals , Circovirus/immunology , Circovirus/genetics , Swine , Virus-Like Particle Vaccines/immunology , Virus-Like Particle Vaccines/genetics , Capsid Proteins/immunology , Capsid Proteins/genetics , Escherichia coli/genetics , Escherichia coli/metabolism , Mice , Viral Antibodies/immunology , Viral Antibodies/blood , Recombinant Proteins/immunology , Recombinant Proteins/genetics , Circoviridae Infections/prevention & control , Circoviridae Infections/immunology , Swine Diseases/prevention & control , Viral Vaccines/immunology , Viral Vaccines/genetics , Vaccine Development , Viral Antigens/immunology , Viral Antigens/genetics , Immunoglobulin G/blood , Cost-Benefit Analysis , Female , Interferon-gamma/metabolism , Vaccine Immunogenicity
6.
J Chem Inf Model ; 64(3): 1043-1049, 2024 Feb 12.
Article in English | MEDLINE | ID: mdl-38270339

ABSTRACT

The quickly increasing size of the Protein Data Bank is challenging biologists to develop a more scalable protein structure alignment tool for fast structure database search. Although many protein structure search algorithms and programs have been designed and implemented for this purpose, most require a large amount of computational time. We propose a novel protein structure search approach, TM-search, which is based on the pairwise structure alignment program TM-align and a new iterative clustering algorithm. Benchmark tests demonstrate that TM-search is 27 times faster than a TM-align full database search while still being able to identify ∼90% of all high TM-score hits, which is 2-10 times more than other existing programs such as Foldseek, Dali, and PSI-BLAST.


Subject(s)
Algorithms , Proteins , Protein Databases , Sequence Alignment , Proteins/chemistry , Benchmarking , Software
7.
BMC Bioinformatics ; 24(1): 260, 2023 Jun 20.
Article in English | MEDLINE | ID: mdl-37340457

ABSTRACT

BACKGROUND: Although mmCIF is the current official format for deposition of protein and nucleic acid structures to the Protein Data Bank (PDB), the legacy PDB format is still the primary supported format for many structural bioinformatics tools. Therefore, reliable software to convert mmCIF structure files to PDB files is needed. Unfortunately, existing conversion programs fail to correctly convert many mmCIF files, especially those with many atoms and/or long chain identifiers. RESULTS: This study proposed BeEM, which converts any mmCIF format structure files to PDB format. BeEM conversion faithfully retains all atomic and chain information, including chain IDs with more than 2 characters, which are not supported by any existing mmCIF to PDB converters. The conversion speed of BeEM is at least ten times faster than existing converters such as MAXIT and Phenix. Part of the reason for the speed improvement is the avoidance of conversion between numerical values and text strings. CONCLUSION: BeEM is a fast and accurate tool for mmCIF-to-PDB format conversion, which is a common procedure in structural biology. The source code is available under the BSD licence at https://github.com/kad-ecoli/BeEM/ .


Subject(s)
Proteins , Software , Proteins/chemistry , Protein Databases
8.
Eur J Neurosci ; 57(6): 900-917, 2023 03.
Article in English | MEDLINE | ID: mdl-36725691

ABSTRACT

The bed nuclei of the stria terminalis (BST) is recognised as a pivotal integrative centre for monitoring emotional valence. It is implicated in the regulation of diverse affective states and motivated behaviours, and decades of research have firmly established its critical role in anxiety-related behavioural processes. Researchers have recently intricately dissected the BST's dynamic activities, its connection patterns and its functions with respect to specific cell types using multiple techniques such as optogenetics, in vivo calcium imaging and transgenic tools to unmask the complex circuitry mechanisms that underlie anxiety. In this review, we principally focus on studies of anxiety-involved neuromodulators within the BST and provide a comprehensive architecture of the anxiety network, highlighting the BST as a key hub in orchestrating anxiety-like behaviour. We posit that these promising efforts will contribute to the identification of an accurate roadmap for future treatment of anxiety disorders.


Subject(s)
Anxiety , Septal Nuclei , Animals , Humans , Anxiety/psychology , Anxiety Disorders/metabolism , Emotions , Genetically Modified Animals , Septal Nuclei/metabolism
9.
Bioinformatics ; 38(10): 2937-2939, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35561202

ABSTRACT

MOTIVATION: The full description of nucleic acid conformation involves eight torsion angles per nucleotide. To simplify this description, we previously developed a representation of the nucleic acid backbone that assigns each nucleotide a pair of pseudo-torsion angles (eta and theta defined by P and C4' atoms; or eta' and theta' defined by P and C1' atoms). A Java program, AMIGOS II, is currently available for calculating eta and theta angles for RNA and for performing motif searches based on eta and theta angles. However, AMIGOS II lacks the ability to parse DNA structures and to calculate eta' and theta' angles. It also has little visualization capacity for 3D structure, making it difficult for users to interpret the computational results. RESULTS: We present AMIGOS III, a PyMOL plugin that calculates the pseudo-torsion angles eta, theta, eta' and theta' for both DNA and RNA structures and performs motif searching based on these angles. Compared to AMIGOS II, AMIGOS III offers improved pseudo-torsion angle visualization for RNA and faster nucleic acid worm database generation; it also introduces pseudo-torsion angle visualization for DNA and nucleic acid worm visualization. Its integration into PyMOL enables easy preparation of tertiary structure inputs and intuitive visualization of involved structures. AVAILABILITY AND IMPLEMENTATION: https://github.com/pylelab/AMIGOSIII. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
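The pseudo-torsion angles described above can be sketched as ordinary dihedral angles over consecutive P and C4' atoms. The atom ordering used below (eta over C4'(i-1), P(i), C4'(i), P(i+1); theta over P(i), C4'(i), P(i+1), C4'(i+1)) follows the commonly cited definition and is an assumption here, since the abstract does not spell it out. The function names and input layout are hypothetical and are not the AMIGOS III API.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def eta_theta(p_atoms, c4_atoms, i):
    """Return (eta, theta) for nucleotide i, given per-nucleotide coordinates.

    p_atoms and c4_atoms are equal-length lists of (x, y, z) tuples for the
    P and C4' atoms of one chain; i must have a neighbor on each side.
    """
    if i < 1 or i + 1 >= len(p_atoms):
        raise IndexError("nucleotide needs a neighbor on both sides")
    eta = dihedral(c4_atoms[i - 1], p_atoms[i], c4_atoms[i], p_atoms[i + 1])
    theta = dihedral(p_atoms[i], c4_atoms[i], p_atoms[i + 1], c4_atoms[i + 1])
    return eta, theta
```

Swapping the C4' coordinates for C1' coordinates in the same function would give the primed angles eta' and theta' mentioned in the abstract.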


Subject(s)
Nucleic Acids , DNA/chemistry , Nucleic Acid Conformation , Nucleic Acids/chemistry , Nucleotides/chemistry , RNA/chemistry
10.
Bioinformatics ; 38(6): 1754-1755, 2022 03 04.
Article in English | MEDLINE | ID: mdl-34978562

ABSTRACT

MOTIVATION: Accurate and efficient predictions of protein structures play an important role in understanding their functions. Iterative Threading Assembly Refinement (I-TASSER) is one of the most successful and widely used protein structure prediction methods in the recent community-wide CASP experiments. Yet, the computational efficiency of I-TASSER is one of the limiting factors that prevent its application for large-scale structure modeling. RESULTS: We present I-TASSER for Graphics Processing Units (GPU-I-TASSER), a GPU accelerated I-TASSER protein structure prediction tool for fast and accurate protein structure prediction. Our implementation is based on OpenACC parallelization of the replica-exchange Monte Carlo simulations to enhance the speed of I-TASSER by extending its capabilities to the GPU architecture. On a benchmark dataset of 71 protein structures, GPU-I-TASSER achieves on average a 10× speedup with comparable structure prediction accuracy compared to the CPU version of the I-TASSER. AVAILABILITY AND IMPLEMENTATION: The complete source code for GPU-I-TASSER can be downloaded and used without restriction from https://zhanggroup.org/GPU-I-TASSER/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
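The replica-exchange Monte Carlo scheme named above relies on a Metropolis criterion for swapping configurations between neighboring temperatures. The sketch below shows that generic textbook criterion only. It is not the I-TASSER force field, temperature ladder, or OpenACC code, and the function names and reduced units (k_B = 1) are assumptions.

```python
import math
import random


def swap_probability(energy_i, energy_j, temp_i, temp_j, k_b=1.0):
    """Metropolis acceptance probability for exchanging two replicas.

    energy_i, energy_j: current energies of the replicas held at
        temperatures temp_i and temp_j.
    k_b: Boltzmann constant in the same units as energy / temperature.
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(energies, temps, i, j, rng=random):
    """Swap the energies at positions i and j in place if the move is accepted."""
    if rng.random() < swap_probability(energies[i], energies[j], temps[i], temps[j]):
        energies[i], energies[j] = energies[j], energies[i]
        return True
    return False
```

Each replica runs independently between swap attempts, which is one reason this family of simulations is a natural candidate for the GPU parallelization the abstract describes.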


Subject(s)
Proteins , Software , Proteins/chemistry , Monte Carlo Method , Algorithms
11.
PLoS Comput Biol ; 18(12): e1010793, 2022 12.
Article in English | MEDLINE | ID: mdl-36548439

ABSTRACT

Accurate identification of protein function is critical to elucidate life mechanisms and design new drugs. We proposed a novel deep-learning method, ATGO, to predict Gene Ontology (GO) attributes of proteins through a triplet neural-network architecture embedded with pre-trained language models from protein sequences. The method was systematically tested on 1068 non-redundant benchmarking proteins and 3328 targets from the third Critical Assessment of Protein Function Annotation (CAFA) challenge. Experimental results showed that ATGO achieved a significant increase of the GO prediction accuracy compared to the state-of-the-art approaches in all aspects of molecular function, biological process, and cellular component. Detailed data analyses showed that the major advantage of ATGO lies in the utilization of pre-trained transformer language models which can extract discriminative functional pattern from the feature embeddings. Meanwhile, the proposed triplet network helps enhance the association of functional similarity with feature similarity in the sequence embedding space. In addition, it was found that the combination of the network scores with the complementary homology-based inferences could further improve the accuracy of the predicted models. These results demonstrated a new avenue for high-accuracy deep-learning function prediction that is applicable to large-scale protein function annotations from sequence alone.


Subject(s)
Computational Biology , Proteins , Gene Ontology , Computational Biology/methods , Proteins/genetics , Proteins/metabolism , Computer Neural Networks , Language
12.
Cell Mol Life Sci ; 79(3): 176, 2022 Mar 05.
Article in English | MEDLINE | ID: mdl-35247097

ABSTRACT

The brain-expressed ubiquilins (UBQLNs) 1, 2 and 4 are a family of ubiquitin adaptor proteins that participate broadly in protein quality control (PQC) pathways, including the ubiquitin proteasome system (UPS). One family member, UBQLN2, has been implicated in numerous neurodegenerative diseases including ALS/FTD. UBQLN2 typically resides in the cytoplasm but in disease can translocate to the nucleus, as in Huntington's disease where it promotes the clearance of mutant Huntingtin. How UBQLN2 translocates to the nucleus and clears aberrant nuclear proteins, however, is not well understood. In a mass spectrometry screen to discover UBQLN2 interactors, we identified a family of small (13 kDa), highly homologous uncharacterized proteins, RTL8, and confirmed the interaction between UBQLN2 and RTL8 both in vitro using recombinant proteins and in vivo using mouse brain tissue. Under endogenous and overexpressed conditions, RTL8 localizes to nucleoli. When co-expressed with UBQLN2, RTL8 promotes nuclear translocation of UBQLN2. RTL8 also facilitates UBQLN2's nuclear translocation during heat shock. UBQLN2 and RTL8 colocalize within ubiquitin-enriched subnuclear structures containing PQC components. The robust effect of RTL8 on the nuclear translocation and subnuclear localization of UBQLN2 does not extend to the other brain-expressed ubiquilins, UBQLN1 and UBQLN4. Moreover, compared to UBQLN1 and UBQLN4, UBQLN2 preferentially stabilizes RTL8 levels in human cell lines and in mouse brain, supporting functional heterogeneity among UBQLNs. As a novel UBQLN2 interactor that recruits UBQLN2 to specific nuclear compartments, RTL8 may regulate UBQLN2 function in nuclear protein quality control.


Subject(s)
Signal Transducing Adaptor Proteins/metabolism , Membrane Proteins/metabolism , Signal Transducing Adaptor Proteins/deficiency , Signal Transducing Adaptor Proteins/genetics , Amino Acid Sequence , Animals , Autophagy-Related Proteins/deficiency , Autophagy-Related Proteins/genetics , Autophagy-Related Proteins/metabolism , Brain/metabolism , Carrier Proteins/genetics , Carrier Proteins/metabolism , Cell Nucleolus/metabolism , HEK293 Cells , Humans , Membrane Proteins/chemistry , Membrane Proteins/genetics , Mice , Knockout Mice , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Protein Binding , Protein Isoforms/chemistry , Protein Isoforms/genetics , Protein Isoforms/metabolism , Recombinant Proteins/biosynthesis , Recombinant Proteins/chemistry , Recombinant Proteins/isolation & purification , Sequence Alignment , Temperature , Ubiquitin/metabolism
13.
J Environ Sci (China) ; 124: 1-10, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36182119

ABSTRACT

Recently, air pollution, especially fine particulate matter (PM2.5) and ozone (O3), has become a severe issue in China. In this study, we first characterized the temporal trends of PM2.5 and O3 for Beijing, Guangzhou, Shanghai, and Wuhan respectively during 2018-2020. The annual mean PM2.5 has decreased by 7.82%-33.92%, while O3 concentration showed insignificant variations of -6.77% to 4.65% during 2018-2020. The generalized additive models (GAMs) were implemented to quantify the contribution of individual meteorological factors and their gas precursors on PM2.5 and O3. From a short-term perspective, GAMs modeling shows that the daily variability of PM2.5 concentration is largely related to the variation of precursor gases (R = 0.67-0.90), while meteorological conditions mainly affect the daily variability of O3 concentration (R = 0.65-0.80) during 2018-2020. The impact of the COVID-19 lockdown on PM2.5 and O3 concentrations was also quantified by using GAMs. During the 2020 lockdown, PM2.5 decreased significantly for these megacities, yet the ozone concentration showed an increasing trend compared to 2019. The GAMs analysis indicated that the contribution of precursor gases to PM2.5 and O3 changes is 3-8 times higher than that of meteorological factors. In general, GAMs modeling on air quality is helpful to the understanding and control of PM2.5 and O3 pollution in China.


Subject(s)
Air Pollutants , Air Pollution , COVID-19 , Ozone , Air Pollutants/analysis , Air Pollution/analysis , China , Cities , Communicable Disease Control , Environmental Monitoring , Humans , Ozone/analysis , Particulate Matter/analysis
14.
Circulation ; 144(14): 1145-1159, 2021 10 05.
Article in English | MEDLINE | ID: mdl-34346740

ABSTRACT

BACKGROUND: Loeys-Dietz syndrome (LDS) is an inherited disorder predisposing individuals to thoracic aortic aneurysm and dissection. Currently, there are no medical treatments except surgical resection. Although the genetic basis of LDS is well-understood, molecular mechanisms underlying the disease remain elusive, impeding the development of a therapeutic strategy. In addition, aortic smooth muscle cells (SMCs) have heterogeneous embryonic origins, depending on their spatial location, and lineage-specific effects of pathogenic variants on SMC function, likely causing regionally constrained LDS manifestations, have been unexplored. METHODS: We identified an LDS family with a dominant pathogenic variant in the TGFBR1 gene (TGFBR1A230T) causing aortic root aneurysm and dissection. To accurately model the molecular defects caused by this mutation, we used human induced pluripotent stem cells from a subject with normal aorta to generate human induced pluripotent stem cells carrying TGFBR1A230T, and corrected the mutation in patient-derived human induced pluripotent stem cells using CRISPR-Cas9 gene editing. After their lineage-specific SMC differentiation through cardiovascular progenitor cell (CPC) and neural crest stem cell lineages, we used conventional molecular techniques and single-cell RNA sequencing to characterize the molecular defects. The resulting data led to subsequent molecular and functional rescue experiments using activin A and rapamycin. RESULTS: Our results indicate the TGFBR1A230T mutation impairs contractile transcript and protein levels, and function in CPC-SMC, but not in neural crest stem cell-SMC. Single-cell RNA sequencing results implicate defective differentiation even in TGFBR1A230T/+ CPC-SMC including disruption of SMC contraction and extracellular matrix formation. Comparison of patient-derived and mutation-corrected cells supported the contractile phenotype observed in the mutant CPC-SMC. TGFBR1A230T selectively disrupted SMAD3 (SMAD family member 3) and AKT (AKT serine/threonine kinase) activation in CPC-SMC, and led to increased cell proliferation. Consistently, single-cell RNA sequencing revealed molecular similarities between a loss-of-function SMAD3 mutation (SMAD3c.652delA/+) and TGFBR1A230T/+. Last, combination treatment with activin A and rapamycin during or after SMC differentiation significantly improved the mutant CPC-SMC contractile gene expression and function, and rescued the mechanical properties of mutant CPC-SMC tissue constructs. CONCLUSIONS: This study reveals that a pathogenic TGFBR1 variant causes lineage-specific SMC defects informing the etiology of LDS-associated aortic root aneurysm. As a potential pharmacological strategy, our results highlight a combination treatment with activin A and rapamycin that can rescue the SMC defects caused by the variant.


Subject(s)
Induced Pluripotent Stem Cells/metabolism , Loeys-Dietz Syndrome/genetics , Transforming Growth Factor-beta Type I Receptor/metabolism , Humans , Loeys-Dietz Syndrome/pathology
15.
PLoS Comput Biol ; 17(3): e1008865, 2021 03.
Article in English | MEDLINE | ID: mdl-33770072

ABSTRACT

The topology of protein folds can be specified by the inter-residue contact-maps and accurate contact-map prediction can help ab initio structure folding. We developed TripletRes to deduce protein contact-maps from discretized distance profiles by end-to-end training of deep residual neural-networks. Compared to previous approaches, the major advantage of TripletRes is in its ability to learn and directly fuse a triplet of coevolutionary matrices extracted from the whole-genome and metagenome databases and therefore minimize the information loss during the course of contact model training. TripletRes was tested on a large set of 245 non-homologous proteins from CASP 11&12 and CAMEO experiments and outperformed other top methods from CASP12 by at least 58.4% for the CASP 11&12 targets and 44.4% for the CAMEO targets in the top-L long-range contact precision. On the 31 FM targets from the latest CASP13 challenge, TripletRes achieved the highest precision (71.6%) for the top-L/5 long-range contact predictions. It was also shown that a simple re-training of the TripletRes model with more proteins can lead to further improvement with precisions comparable to state-of-the-art methods developed after CASP13. These results demonstrate a novel efficient approach to extend the power of deep convolutional networks for high-accuracy medium- and long-range protein contact-map predictions starting from primary sequences, which are critical for constructing 3D structure of proteins that lack homologous templates in the PDB library.
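The top-L and top-L/5 long-range precision figures quoted above can be computed along the lines of the sketch below. It assumes the common CASP convention that a long-range pair has sequence separation of at least 24 residues, and it divides by the number of predictions actually evaluated. Both choices, along with the function name and input layout, are assumptions rather than details from the abstract.

```python
def top_contact_precision(predicted, true_contacts, seq_len, divisor=1, min_sep=24):
    """Precision of the top L/divisor long-range contact predictions.

    predicted: iterable of (i, j, score) with residue indices i, j and a
        confidence score (higher means more confident).
    true_contacts: set of (i, j) tuples with i < j taken from the
        experimental structure.
    seq_len: sequence length L.
    divisor: 1 for top-L, 5 for top-L/5, and so on.
    min_sep: minimum |i - j| for a pair to count as long-range.
    """
    long_range = [(i, j, s) for i, j, s in predicted if abs(i - j) >= min_sep]
    long_range.sort(key=lambda item: item[2], reverse=True)
    n_top = max(1, seq_len // divisor)
    top = long_range[:n_top]
    if not top:
        return 0.0
    hits = sum(1 for i, j, _ in top if (min(i, j), max(i, j)) in true_contacts)
    return hits / len(top)
```

A call such as `top_contact_precision(preds, native_pairs, seq_len=150, divisor=5)` would evaluate the 30 most confident long-range predictions for a 150-residue protein, where `preds` and `native_pairs` are hypothetical inputs.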


Subject(s)
Computer Neural Networks , Proteins , Protein Sequence Analysis/methods , Computational Biology , Protein Conformation , Protein Folding , Proteins/chemistry , Proteins/metabolism , Reproducibility of Results
16.
Environ Sci Technol ; 56(14): 9988-9998, 2022 07 19.
Article in English | MEDLINE | ID: mdl-35767687

ABSTRACT

Nitrogen dioxide (NO2) at the ground level poses a serious threat to environmental quality and public health. This study developed a novel, artificial intelligence approach by integrating spatiotemporally weighted information into the missing extra-trees and deep forest models to first fill the satellite data gaps and increase data availability by 49% and then derive daily 1 km surface NO2 concentrations over mainland China with full spatial coverage (100%) for the period 2019-2020 by combining surface NO2 measurements, satellite tropospheric NO2 columns derived from TROPOMI and OMI, atmospheric reanalysis, and model simulations. Our daily surface NO2 estimates have an average out-of-sample (out-of-city) cross-validation coefficient of determination of 0.93 (0.71) and root-mean-square error of 4.89 (9.95) µg/m3. The daily seamless high-resolution and high-quality dataset "ChinaHighNO2" allows us to examine spatial patterns at fine scales such as the urban-rural contrast. We observed systematic large differences between urban and rural areas (28% on average) in surface NO2, especially in provincial capitals. Strong holiday effects were found, with average declines of 22 and 14% during the Spring Festival and the National Day in China, respectively. Unlike North America and Europe, there is little difference between weekdays and weekends (within ±1 µg/m3). During the COVID-19 pandemic, surface NO2 concentrations decreased considerably and then gradually returned to normal levels around the 72nd day after the Lunar New Year in China, which is about 3 weeks longer than the tropospheric NO2 column, implying that the former can better represent the changes in NOx emissions.


Subject(s)
Air Pollutants , Air Pollution , COVID-19 , Air Pollutants/analysis , Air Pollution/analysis , Artificial Intelligence , China , Environmental Monitoring , Humans , Nitrogen Dioxide/analysis , Pandemics
17.
Macromol Rapid Commun ; 43(2): e2100449, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34624165

ABSTRACT

Processable microporous organic polymers (MOPs) attract great research interest because their various forms, such as monoliths and membranes, are suited to practical application. Most processable MOPs usually need harsh conditions such as the use of expensive metal catalysts, specialized stereospecific monomers, etc., which restrict the sustainable and practical applications of processable MOPs. Therefore, the economical mass production of processable MOPs remains a formidable challenge. Herein, a novel strategy is reported for constructing processable hypercrosslinked polymers (HCPs) through a two-step synthesis of pre-crosslinking and deep-crosslinking using divinylbenzene (DVB) as a self-crosslinking monomer under the catalysis of a small amount of FeCl3. The resulting HCPs monoliths possess high BET surface area of 1033-1056 m² g⁻¹ with hierarchical porosity, and show excellent mechanical strength up to 65 MPa. It is, to the best of the authors' knowledge, the first report of using aromatic vinyl monomers as self-crosslinking monomers to generate HCPs monoliths with high surface area, yielding no by-products, and high mechanical strength.


Subject(s)
Polymers , Catalysis , Porosity
18.
Proc Natl Acad Sci U S A ; 116(32): 15930-15938, 2019 08 06.
Article in English | MEDLINE | ID: mdl-31341084

ABSTRACT

Most proteins exist with multiple domains in cells for cooperative functionality. However, structural biology and protein folding methods are often optimized for single-domain structures, resulting in a rapidly growing gap between the improved capability for tertiary structure determination and high demand for multidomain structure models. We have developed a pipeline, termed DEMO, for constructing multidomain protein structures by docking-based domain assembly simulations, with interdomain orientations determined by the distance profiles from analogous templates as detected through domain-level structure alignments. The pipeline was tested on a comprehensive benchmark set of 356 proteins consisting of 2-7 continuous and discontinuous domains, for which DEMO generated models with correct global fold (TM-score > 0.5) for 86% of cases with continuous domains and for 100% of cases with discontinuous domain structures, starting from randomly oriented target-domain structures. DEMO was also applied to reassemble multidomain targets in the CASP12 and CASP13 experiments using domain structures excised from the top server predictions, where the full-length DEMO models showed a significantly improved quality over the original server models. Finally, sparse restraints of mass spectrometry-generated cross-linking data and cryo-EM density maps are incorporated into DEMO, resulting in improvements in the average TM-score by 6.3% and 12.5%, respectively. The results demonstrate an efficient approach to assembling multidomain structures, which can be easily used for automated, genome-scale multidomain protein structure assembly.
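The TM-score threshold mentioned above can be made concrete with a short sketch of the standard TM-score formula, TM = (1/L) Σ 1/(1 + (d_i/d0)^2) with d0 = 1.24 (L - 15)^(1/3) - 1.8. This is an illustration only, not the DEMO implementation: it scores one fixed superposition, whereas the published TM-score takes the maximum over superpositions. The function name and the 0.5 Å floor on d0 are assumptions.

```python
def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances: per-residue distances (in angstroms) between aligned
               model and native residues.
    target_length: number of residues in the target (normalisation length).
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    if target_length > 15:
        d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5  # assumption: the cube root is undefined here, so use the floor
    d0 = max(d0, 0.5)  # assumption: floor for short chains
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length

# A perfect match over half of a 100-residue target scores exactly 0.5,
# the conventional cutoff for a correct global fold.
print(tm_score([0.0] * 50, 100))
```

Because the score is normalised by the target length rather than the number of aligned residues, unaligned regions lower the score, which is why it is used to judge full-length multidomain models.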


Subject(s)
Proteins/chemistry , Cross-Linking Reagents/chemistry , Cryoelectron Microscopy , Protein Databases , Molecular Models , Protein Domains , Software
19.
J Xray Sci Technol ; 30(1): 1-12, 2022.
Article in English | MEDLINE | ID: mdl-34719471

ABSTRACT

High-energy, high-dose, microfocus X-ray computed tomography (HHM CT) is one of the most effective methods for high-resolution X-ray radiography inspection of high-density samples with fine structures. Minimizing the effective focal spot size of the X-ray source can significantly improve the spatial resolution and the quality of the sample images, which is critical to the performance of HHM CT. The objective of this study is to present a 9 MeV HHM CT prototype based on a high-average-current photo-injector, in which X-rays with a focal spot size of about 70 µm are produced by using tightly focused electron beams with a 65/66 µm beam size to hit an optimized tungsten target. In a digital radiography (DR) experiment using this HHM CT, clear imaging of a standard 0.1 mm lead DR resolution phantom reveals a resolution of 6 lp/mm (line pairs per mm), while a 5 lp/mm resolution is obtained in CT mode using another resolution phantom made of 10 mm iron. Moreover, compared with common CT systems, a better image of a turbine blade prototype was obtained with this HHM CT system, which indicates the promising application potential of HHM CT in non-destructive inspection or testing of high-density, fine-structure samples.
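As a rough aid to reading the resolution figures above, the sketch below converts line pairs per mm into the width of a single resolved line, assuming each line pair consists of one dark and one bright bar of equal width. The function name is hypothetical.

```python
def line_width_um(lp_per_mm):
    """Width of a single line (half of one line pair) in micrometres."""
    if lp_per_mm <= 0:
        raise ValueError("lp_per_mm must be positive")
    return 1000.0 / (2.0 * lp_per_mm)

for mode, resolution in (("DR", 6), ("CT", 5)):
    print(f"{mode}: {resolution} lp/mm -> {line_width_um(resolution):.1f} um per line")
# prints:
# DR: 6 lp/mm -> 83.3 um per line
# CT: 5 lp/mm -> 100.0 um per line
```

Under this assumption, both values are somewhat larger than the reported focal spot size of about 70 µm.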


Subject(s)
Radiographic Image Enhancement , X-Ray Computed Tomography , Imaging Phantoms , X-Ray Computed Tomography/methods , X-Rays
20.
J Proteome Res ; 20(2): 1178-1189, 2021 02 05.
Article in English | MEDLINE | ID: mdl-33393786

ABSTRACT

When the JCVI-syn3.0 genome was designed and implemented in 2016 as the minimal genome of a free-living organism, approximately one-third of the 438 protein-coding genes had no known function. Subsequent refinement into JCVI-syn3A led to the inclusion of 16 additional protein-coding genes, including several of unknown function, resulting in an improved growth phenotype. Here, we seek to unveil the biological roles and protein-protein interaction (PPI) networks for these poorly characterized proteins using state-of-the-art deep learning contact-assisted structure prediction, followed by structure-based annotation of functions and PPI predictions. Our pipeline is able to confidently assign functions for many previously unannotated proteins such as putative vitamin transporters, which suggests the importance of nutrient uptake even in a minimized genome. Remarkably, despite the artificial selection of genes in the minimal syn3 genome, our reconstructed PPI network still shows a power law distribution of node degrees typical of naturally evolved bacterial PPI networks. Making use of our framework for combined structure/function/interaction modeling, we are able to identify both fundamental aspects of network biology that are retained in a minimal proteome and additional essential functions not yet recognized among the poorly annotated components of the syn3.0 and syn3A proteomes.


Subject(s)
Essential Genes , Protein Interaction Maps , Computational Biology , Proteome/genetics