1.
Angew Chem Int Ed Engl ; : e202400441, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38587149

ABSTRACT

Nickel-catalyzed transannulation reactions triggered by the extrusion of small gaseous molecules have emerged as a powerful strategy for the efficient construction of heterocyclic compounds. However, their use in asymmetric synthesis remains challenging because of the difficulty in controlling stereo- and regioselectivity. Herein, we report the first nickel-catalyzed asymmetric synthesis of N-N atropisomers by the denitrogenative transannulation of benzotriazones with alkynes. A broad range of N-N atropisomers was obtained with excellent regio- and enantioselectivity under mild conditions. Moreover, density functional theory (DFT) calculations provided insights into the nickel-catalyzed reaction mechanism and enantioselectivity control.

2.
Comput Biol Med ; 172: 108227, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38460308

ABSTRACT

Accurately predicting protein-ATP binding residues is critical for protein function annotation and drug discovery. Computational methods that predict binding residues from protein sequence information have advanced notably in predictive accuracy. Nevertheless, these methods still face several challenges, including limited means of extracting more discriminative features and inadequate algorithms for integrating protein and residue information. To address these problems, we propose ATP-Deep, a novel predictor of protein-ATP binding residues. ATP-Deep harnesses unsupervised pre-trained language models and incorporates domain-specific evolutionary context information from homologous sequences. It further refines the residue-level embedding by integrating the corresponding protein-level information and employs a context-based co-attention mechanism to fuse multiple sources of features. Performance evaluation on the benchmark datasets shows that ATP-Deep achieves AUCs of 0.954 and 0.951, respectively, surpassing the state-of-the-art model. These findings underscore the effectiveness of incorporating protein-level information and deploying a context-based co-attention mechanism to improve the prediction of protein-ATP binding residues.


Subjects
Algorithms; Proteins; Protein Binding; Proteins/chemistry; Amino Acid Sequence; Adenosine Triphosphate
3.
J Chem Inf Model ; 64(4): 1407-1418, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38334115

ABSTRACT

Studying the effect of single amino acid variations (SAVs) on protein structure and function is integral to advancing our understanding of molecular processes, evolutionary biology, and disease mechanisms. Screening for deleterious variants is one of the crucial issues in precision medicine. Here, we propose a novel computational approach, TransEFVP, based on large-scale protein language model embeddings and a transformer-based neural network to predict disease-associated SAVs. The model adopts a two-stage architecture: the first stage is designed to fuse different feature embeddings through a transformer encoder. In the second stage, a support vector machine model is employed to quantify the pathogenicity of SAVs after dimensionality reduction. The prediction performance of TransEFVP on blind test data achieves a Matthews correlation coefficient of 0.751, an F1-score of 0.846, and an area under the receiver operating characteristic curve of 0.871, higher than the existing state-of-the-art methods. The benchmark results demonstrate that TransEFVP can be explored as an accurate and effective SAV pathogenicity prediction method. The data and codes for TransEFVP are available at https://github.com/yzh9607/TransEFVP/tree/master for academic use.


Subjects
Algorithms; Proteins; Humans; Proteins/chemistry; Amino Acid Sequence; Neural Networks, Computer; Amino Acids
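The TransEFVP abstract reports a Matthews correlation coefficient and an F1-score. As a reminder of how these two figures are derived from a binary confusion matrix, here is a minimal sketch in plain Python. The counts are hypothetical and are not taken from the paper; the function names are illustrative only.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal is empty (denominator is zero).
    return (tp * tn - fp * fn) / denom if denom else 0.0


def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical counts for illustration only (not from the paper).
tp, fp, tn, fn = 80, 20, 90, 10
print(round(mcc(tp, fp, tn, fn), 3), round(f1_score(tp, fp, fn), 3))
```

Unlike F1, the MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside F1 on imbalanced variant data sets.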
4.
J Chem Inf Model ; 64(4): 1394-1406, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38349747

ABSTRACT

Nonsynonymous single-nucleotide polymorphisms (nsSNPs), implicated in over 6000 diseases, necessitate accurate prediction for expedited drug discovery and improved disease diagnosis. In this study, we propose FCMSTrans, a novel nsSNP predictor that innovatively combines the transformer framework and multiscale modules for comprehensive feature extraction. The distinctive attribute of FCMSTrans resides in a deep feature combination strategy. This strategy amalgamates evolutionary-scale modeling (ESM) and ProtTrans (PT) features, providing an understanding of protein biochemical properties, and position-specific scoring matrix, secondary structure, predicted relative solvent accessibility, and predicted disorder (PSPP) features, which are derived from four protein sequences and structure-oriented characteristics. This feature combination offers a comprehensive view of the molecular dynamics involving nsSNPs. Our model employs the transformer's self-attention mechanisms across multiple layers, extracting higher-level and abstract representations. Simultaneously, varied-level features are captured by multiscale convolutions, enriching feature abstraction at multiple echelons. Our comparative analyses with existing methodologies highlight significant improvements made possible by the integrated feature fusion approach adopted in FCMSTrans. This is further substantiated by performance assessments based on diverse data sets, such as PredictSNP, MMP, and PMD, with areas under the curve (AUCs) of 0.869, 0.819, and 0.693, respectively. Furthermore, FCMSTrans shows robustness and superiority by outperforming the current best predictor, PROVEAN, in a blind test conducted on a third-party data set, achieving an impressive AUC score of 0.7838. The Python code of FCMSTrans is available at https://github.com/gc212/FCMSTrans for academic usage.


Subjects
Drug Discovery; Electric Power Supplies; Amino Acid Sequence; Area Under Curve; Polymorphism, Single Nucleotide
5.
Int J Biol Macromol ; 260(Pt 1): 129245, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38191109

ABSTRACT

Aerogels with low thermal conductivity and high adsorption capacity present a promising solution to curb water pollution caused by organic reagents as well as mitigate heat loss. Although aerogels exhibiting good adsorption capacity and thermal insulation have been reported, producing materials that combine mechanical integrity, high flexibility and shear resistance remains a formidable task. Here, we produced bacterial cellulose-based ultralight multifunctional hybrid aerogels by freeze-drying followed by chemical vapor deposition silylation. The hybrid aerogels displayed a low density of 10-15 mg/cm³, high porosity exceeding 99.1%, low thermal conductivity (27.3-29.2 mW/(m·K)) and superior hydrophobicity (water contact angle > 120°). They also exhibited excellent mechanical properties including superelasticity, high flexibility and shear resistance. The hybrid aerogels demonstrated high heat shielding efficiency when used as an insulating material. As a selective oil absorbent, the hybrid aerogels exhibit a maximum adsorption capacity of up to approximately 156 times their own weight and excellent recoverability. In particular, the aerogels' highly accessible porous microstructure results in an impressive flux rate of up to 162 L/(h·g) when used as a filter in a continuous oil-water separator to isolate n-hexane-water mixtures. This work presents a novel endeavor to create high-performance, sustainable, reusable, and adaptable multifunctional aerogels.


Subjects
Cellulose; Gases; Adsorption; Freeze Drying; Hot Temperature
6.
ACS Omega ; 9(2): 2032-2047, 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38250421

ABSTRACT

Genetic variations (including substitutions, insertions, and deletions) exert a profound influence on DNA sequences. These variations are systematically classified as synonymous, nonsynonymous, and nonsense, each manifesting distinct effects on proteins. The implementation of high-throughput sequencing has significantly augmented our comprehension of the intricate interplay between gene variations and protein structure and function, as well as their ramifications in the context of diseases. Frameshift variations, particularly small insertions and deletions (indels), disrupt protein coding and are instrumental in disease pathogenesis. This article presents a succinct review of computational methods, databases, current challenges, and future directions in predicting the consequences of coding frameshift small indel variations. We analyzed the predictive efficacy, reliability, and utilization of computational methods and the variant account, reliability, and utilization of databases. We also compared the prediction methodologies on GOF/LOF pathogenic variation data. In addressing the challenges of prediction accuracy and cross-species generalizability, nascent technologies such as AI and deep learning harbor immense potential to enhance predictive capabilities. The importance of interdisciplinary research and collaboration cannot be overstated for devising effective diagnosis, treatment, and prevention strategies for diseases associated with coding frameshift indel variations.

7.
Am J Obstet Gynecol ; 230(4): 390-402, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38072372

ABSTRACT

OBJECTIVE: This study aimed to provide procedure-specific estimates of the risk for symptomatic venous thromboembolism and major bleeding in noncancer gynecologic surgeries. DATA SOURCES: We conducted comprehensive searches on Embase, MEDLINE, Web of Science, and Google Scholar. Furthermore, we performed separate searches for randomized trials that addressed the effects of thromboprophylaxis. STUDY ELIGIBILITY CRITERIA: Eligible studies were observational studies that enrolled ≥50 adult patients who underwent noncancer gynecologic surgery procedures and that reported the absolute incidence of at least 1 of the following: symptomatic pulmonary embolism, symptomatic deep vein thrombosis, symptomatic venous thromboembolism, bleeding that required reintervention (including re-exploration and angioembolization), bleeding that led to transfusion, or postoperative hemoglobin level <70 g/L. METHODS: Teams of 2 reviewers independently assessed eligibility, performed data extraction, and evaluated the risk of bias of the eligible articles. We adjusted the reported estimates for thromboprophylaxis and length of follow-up and used the median value from studies to determine the cumulative incidence at 4 weeks postsurgery stratified by patient venous thromboembolism risk factors and used the Grading of Recommendations Assessment, Development and Evaluation approach to rate the evidence certainty. RESULTS: We included 131 studies (1,741,519 patients) that reported venous thromboembolism risk estimates for 50 gynecologic noncancer procedures and bleeding requiring reintervention estimates for 35 procedures. The evidence certainty was generally moderate or low for venous thromboembolism and low or very low for bleeding requiring reintervention. The risk for symptomatic venous thromboembolism varied from a median of <0.1% for several procedures (eg, transvaginal oocyte retrieval) to 1.5% for others (eg, minimally invasive sacrocolpopexy with hysterectomy, 1.2%-4.6% across patient venous thromboembolism risk groups). Venous thromboembolism risk was <0.5% for 30 (60%) of the procedures; 0.5% to 1.0% for 10 (20%) procedures; and >1.0% for 10 (20%) procedures. The risk for bleeding requiring reintervention varied from <0.1% (transvaginal oocyte retrieval) to 4.0% (open myomectomy). The bleeding requiring reintervention risk was <0.5% in 17 (49%) procedures, 0.5% to 1.0% for 12 (34%) procedures, and >1.0% in 6 (17%) procedures. CONCLUSION: The risk for venous thromboembolism in gynecologic noncancer surgery varied between procedures and patients. Venous thromboembolism risks exceeded the bleeding risks only among selected patients and procedures. Although most of the evidence is of low certainty, the results nevertheless provide a compelling rationale for restricting pharmacologic thromboprophylaxis to a minority of patients who undergo gynecologic noncancer procedures.


Subjects
Thrombosis; Venous Thromboembolism; Adult; Humans; Female; Anticoagulants/therapeutic use; Venous Thromboembolism/prevention & control; Postoperative Complications/prevention & control; Hemorrhage/chemically induced; Gynecologic Surgical Procedures/adverse effects
8.
Ann Surg ; 279(2): 213-225, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-37551583

ABSTRACT

OBJECTIVE: To provide procedure-specific estimates of symptomatic venous thromboembolism (VTE) and major bleeding after abdominal surgery. BACKGROUND: The use of pharmacological thromboprophylaxis represents a trade-off that depends on VTE and bleeding risks that vary between procedures; their magnitude remains uncertain. METHODS: We identified observational studies reporting procedure-specific risks of symptomatic VTE or major bleeding after abdominal surgery, adjusted the reported estimates for thromboprophylaxis and length of follow-up, and estimated cumulative incidence at 4 weeks postsurgery, stratified by VTE risk groups, and rated evidence certainty. RESULTS: After eligibility screening, 285 studies (8,048,635 patients) reporting on 40 general abdominal, 36 colorectal, 15 upper gastrointestinal, and 24 hepatopancreatobiliary surgery procedures proved eligible. Evidence certainty proved generally moderate or low for VTE and low or very low for bleeding requiring reintervention. The risk of VTE varied substantially among procedures: in general abdominal surgery from a median of <0.1% in laparoscopic cholecystectomy to a median of 3.7% in open small bowel resection, in colorectal from 0.3% in minimally invasive sigmoid colectomy to 10.0% in emergency open total proctocolectomy, and in upper gastrointestinal/hepatopancreatobiliary from 0.2% in laparoscopic sleeve gastrectomy to 6.8% in open distal pancreatectomy for cancer. CONCLUSIONS: VTE thromboprophylaxis provides net benefit through VTE reduction with a small increase in bleeding in some procedures (eg, open colectomy and open pancreaticoduodenectomy), whereas the opposite is true in others (eg, laparoscopic cholecystectomy and elective groin hernia repairs). In many procedures, thromboembolism and bleeding risks are similar, and decisions depend on individual risk prediction and values and preferences regarding VTE and bleeding.


Subjects
Colorectal Neoplasms; Thrombosis; Venous Thromboembolism; Humans; Anticoagulants/therapeutic use; Colorectal Neoplasms/drug therapy; Hemorrhage; Postoperative Complications/epidemiology; Postoperative Complications/prevention & control; Postoperative Complications/drug therapy; Venous Thromboembolism/epidemiology; Venous Thromboembolism/etiology; Venous Thromboembolism/prevention & control
9.
Am J Obstet Gynecol ; 230(4): 403-416, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37827272

ABSTRACT

OBJECTIVE: This study aimed to provide procedure-specific estimates of the risk of symptomatic venous thromboembolism and major bleeding in the absence of thromboprophylaxis, following gynecologic cancer surgery. DATA SOURCES: We conducted comprehensive searches on Embase, MEDLINE, Web of Science, and Google Scholar for observational studies. We also reviewed reference lists of eligible studies and review articles. We performed separate searches for randomized trials addressing effects of thromboprophylaxis and conducted a web-based survey on thromboprophylaxis practice. STUDY ELIGIBILITY CRITERIA: Observational studies enrolling ≥50 adult patients undergoing gynecologic cancer surgery procedures reporting absolute incidence for at least 1 of the following were included: symptomatic pulmonary embolism, symptomatic deep vein thrombosis, symptomatic venous thromboembolism, bleeding requiring reintervention (including reexploration and angioembolization), bleeding leading to transfusion, or postoperative hemoglobin <70 g/L. METHODS: Two reviewers independently assessed eligibility, performed data extraction, and evaluated risk of bias of eligible articles. We adjusted the reported estimates for thromboprophylaxis and length of follow-up and used the median value from studies to determine cumulative incidence at 4 weeks postsurgery stratified by patient venous thromboembolism risk factors. The GRADE approach was applied to rate evidence certainty. RESULTS: We included 188 studies (398,167 patients) reporting on 37 gynecologic cancer surgery procedures. The evidence certainty was generally low to very low. Median symptomatic venous thromboembolism risk (in the absence of prophylaxis) was <1% in 13 of 37 (35%) procedures, 1% to 2% in 11 of 37 (30%), and >2.0% in 13 of 37 (35%). 
The risks of venous thromboembolism varied from 0.1% in low venous thromboembolism risk patients undergoing cervical conization to 33.5% in high venous thromboembolism risk patients undergoing pelvic exenteration. Estimates of bleeding requiring reintervention varied from <0.1% to 1.3%. Median risks of bleeding requiring reintervention were <1% in 22 of 29 (76%) and 1% to 2% in 7 of 29 (24%) procedures. CONCLUSION: Venous thromboembolism reduction with thromboprophylaxis likely outweighs the increase in bleeding requiring reintervention in many gynecologic cancer procedures (eg, open surgery for ovarian cancer and pelvic exenteration). In some procedures (eg, laparoscopic total hysterectomy without lymphadenectomy), thromboembolism and bleeding risks are similar, and decisions depend on individual risk prediction and values and preferences regarding venous thromboembolism and bleeding.


Subjects
Neoplasms; Thrombosis; Venous Thromboembolism; Adult; Humans; Female; Anticoagulants/therapeutic use; Venous Thromboembolism/epidemiology; Venous Thromboembolism/prevention & control; Postoperative Complications/prevention & control; Hemorrhage
10.
J Chem Inf Model ; 63(22): 7239-7257, 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-37947586

ABSTRACT

Understanding the pathogenicity of missense mutations (MMs) is essential for shedding light on genetic diseases, gene functions, and individual variations. In this study, we propose a novel computational approach, called MMPatho, for enhancing missense mutation pathogenic prediction. First, we established a large-scale nonredundant MM benchmark data set based on the entire Ensembl database, complemented by a focused blind test set specifically for pathogenic GOF/LOF MMs. Based on this data set, for each mutation, we utilized Ensembl VEP v104 and dbNSFP v4.1a to extract variant-level, amino acid-level, individuals' outputs, and genome-level features. Additionally, protein sequences were generated using ENSP identifiers with the Ensembl API, and then encoded. The mutant sites' ESM-1b and ProtTrans-T5 embeddings were subsequently extracted. Then, our model group (MMPatho), which comprises ConsMM and EvoIndMM, was developed by building on these efforts. Specifically, ConsMM employs individuals' outputs and XGBoost with SHAP explanation analysis, while EvoIndMM investigates the potential enhancement of predictive capability by incorporating evolutionary information from ESM-1b and ProtT5-XL-U50, large protein language embeddings. Through rigorous comparative experiments, both ConsMM and EvoIndMM achieved remarkable AUROC (0.9836 and 0.9854) and AUPR (0.9852 and 0.9902) values on the blind test set devoid of overlapping variations and proteins from the training data, thus highlighting the superiority of our computational approach in the prediction of MM pathogenicity. Our Web server, available at http://csbio.njust.edu.cn/bioinf/mmpatho/, allows researchers to predict the pathogenicity (alongside the reliability index score) of MMs using the ConsMM and EvoIndMM models and provides extensive annotations for user input. Additionally, the newly constructed benchmark data set and blind test set can be accessed via the data page of our web server.


Subjects
Computational Biology; Mutation, Missense; Humans; Reproducibility of Results; Consensus; Proteins
11.
IEEE/ACM Trans Comput Biol Bioinform ; 20(5): 3205-3214, 2023.
Article in English | MEDLINE | ID: mdl-37289599

ABSTRACT

It has been demonstrated that RNA modifications play essential roles in multiple biological processes. Accurate identification of RNA modifications in the transcriptome is critical for providing insights into the biological functions and mechanisms. Many tools have been developed for predicting RNA modifications at single-base resolution, which employ conventional feature engineering methods that focus on feature design and feature selection processes that require extensive biological expertise and may introduce redundant information. With the rapid development of artificial intelligence technologies, end-to-end methods are favorably received by researchers. Nevertheless, each well-trained model is only suitable for a specific RNA methylation modification type for nearly all of these approaches. In this study, we present MRM-BERT by feeding task-specific sequences into the powerful BERT (Bidirectional Encoder Representations from Transformers) model and implementing fine-tuning, which exhibits competitive performance to the state-of-the-art methods. MRM-BERT avoids repeated de novo training of the model and can predict multiple RNA modifications such as pseudouridine, m6A, m5C, and m1A in Mus musculus, Arabidopsis thaliana, and Saccharomyces cerevisiae. In addition, we analyse the attention heads to provide high attention regions for the prediction, and conduct saturated in silico mutagenesis of the input sequences to discover potential changes of RNA modifications, which can better assist researchers in their follow-up research.


Subjects
Arabidopsis; Artificial Intelligence; Mice; Animals; Pseudouridine; Arabidopsis/genetics; Transcriptome; Saccharomyces cerevisiae/genetics; RNA/genetics
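The MRM-BERT abstract mentions saturated in silico mutagenesis of the input sequences. A minimal sketch of the enumeration step is shown below: it generates every single-base substitution of an RNA sequence. The function name and the example sequence are hypothetical, and the scoring of each mutant by a trained model is deliberately left out because the abstract does not describe that interface.

```python
def saturation_mutants(seq, alphabet="ACGU"):
    """Yield (position, ref, alt, mutant) for every single-base substitution."""
    seq = seq.upper()
    for i, ref in enumerate(seq):
        for alt in alphabet:
            if alt != ref:
                # Replace only position i, keeping the rest of the sequence.
                yield i, ref, alt, seq[:i] + alt + seq[i + 1:]


# Hypothetical 5-nt example; a sequence of length n yields 3 * n mutants.
mutants = list(saturation_mutants("GGACU"))
print(len(mutants))
```

In practice each mutant would be scored by the predictor and compared with the score of the unmutated sequence to see which positions most affect the predicted modification.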
12.
Materials (Basel) ; 16(12), 2023 Jun 14.
Article in English | MEDLINE | ID: mdl-37374571

ABSTRACT

Ductility-based structural design is currently the mainstream method. In order to analyze the ductility performance of concrete columns with high-strength steel reinforcements under eccentric compression, corresponding experimental studies have been performed. Numerical models were established, and their reliability was verified. Based on the numerical models, the parameter analysis was carried out, where eccentricity, concrete strength, and reinforcement ratio were considered to systematically discuss the ductility of the concrete column section with high-strength steel reinforcement. The results show that the ductility of the section under eccentric compression increases with the strength of the concrete and eccentricity, and decreases with the reinforcement ratio. Finally, a simplified calculation formula capable of quantitatively evaluating the section ductility was proposed.

13.
Huan Jing Ke Xue ; 44(4): 2093-2102, 2023 Apr 08.
Article in Chinese | MEDLINE | ID: mdl-37040959

ABSTRACT

To reveal the characteristics and key impact factors of phytoplankton communities in different types of lakes, sampling surveys for phytoplankton and water quality parameters were conducted at 174 sampling sites in a total of 24 lakes covering urban (UL), countryside (CL), and ecological conservation (EL) areas of Wuhan in spring, summer, autumn, and winter 2018. The results showed that a total of 365 species of phytoplankton from nine phyla and 159 genera were identified in the three types of lakes. The main species were green algae, cyanobacteria, and diatoms, accounting for 55.34%, 15.89%, and 15.07% of the total number of species, respectively. The phytoplankton cell density varied from 3.60×10⁶ to 421.99×10⁶ cells·L⁻¹, chlorophyll-a content varied from 15.60 to 240.50 µg·L⁻¹, biomass varied from 27.71 to 379.79 mg·L⁻¹, and the Shannon-Wiener diversity index varied from 0.29 to 2.86. Among the three lake types, cell density, chlorophyll-a content, and biomass were lower in EL and UL, whereas the opposite was true for the Shannon-Wiener diversity index. NMDS and ANOSIM analysis showed differences in phytoplankton community structure (Stress=0.13, R=0.048, P=0.2298). In addition, the phytoplankton community structure of the three lake types had significant seasonal characteristics, with chlorophyll-a content and biomass being significantly higher in summer than in winter (P<0.05). Spearman correlation analysis showed that phytoplankton biomass decreased with increasing N:P in UL and CL, whereas the opposite was true for EL. Redundancy analysis (RDA) showed that water temperature (WT), pH, NO₃⁻, electrical conductivity (EC), and N:P were the key factors that significantly affected the variability in phytoplankton community structure in the three types of lakes in Wuhan (P<0.05).


Subjects
Cyanobacteria; Diatoms; Phytoplankton; Lakes/analysis; Chlorophyll/analysis; Chlorophyll A
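The lake survey above reports Shannon-Wiener diversity index values between 0.29 and 2.86. For readers unfamiliar with the index, the following sketch computes H' = -Σ p_i ln p_i from per-taxon abundances. The natural logarithm is an assumption here, since the abstract does not state the log base, and the counts are hypothetical.

```python
import math


def shannon_wiener(counts):
    """H' = -sum(p_i * ln p_i) over taxa with non-zero abundance."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    # Taxa with zero abundance contribute nothing and are skipped.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical cell counts per taxon at one sampling site.
site_counts = [120, 60, 15, 5]
print(round(shannon_wiener(site_counts), 3))
```

The index is zero when a single taxon accounts for all cells and reaches ln(S) when S taxa are equally abundant, so higher values indicate a more even and richer community.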
15.
Brief Bioinform ; 24(1), 2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36528806

ABSTRACT

Determining the pathogenicity and functional impact (i.e. gain-of-function; GOF or loss-of-function; LOF) of a variant is vital for unraveling the genetic level mechanisms of human diseases. To provide a 'one-stop' framework for the accurate identification of pathogenicity and functional impact of variants, we developed a two-stage deep-learning-based computational solution, termed VPatho, which was trained using a total of 9619 pathogenic GOF/LOF and 138 026 neutral variants curated from various databases. A total number of 138 variant-level, 262 protein-level and 103 genome-level features were extracted for constructing the models of VPatho. The development of VPatho consists of two stages: (i) a random under-sampling multi-scale residual neural network (ResNet) with a newly defined weighted-loss function (RUS-Wg-MSResNet) was proposed to predict variants' pathogenicity on the gnomAD_NV + GOF/LOF dataset; and (ii) an XGBOD model was constructed to predict the functional impact of the given variants. Benchmarking experiments demonstrated that RUS-Wg-MSResNet achieved the highest prediction performance with the weights calculated based on the ratios of neutral versus pathogenic variants. Independent tests showed that both RUS-Wg-MSResNet and XGBOD achieved outstanding performance. Moreover, assessed using variants from the CAGI6 competition, RUS-Wg-MSResNet achieved superior performance compared to state-of-the-art predictors. The fine-trained XGBOD models were further used to blind test the whole LOF data downloaded from gnomAD and accordingly, we identified 31 nonLOF variants that were previously labeled as LOF/uncertain variants. As an implementation of the developed approach, a webserver of VPatho is made publicly available at http://csbio.njust.edu.cn/bioinf/vpatho/ to facilitate community-wide efforts for profiling and prioritizing the query variants with respect to their pathogenicity and functional impact.


Subjects
Deep Learning; Humans; Gain of Function Mutation; Genome
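The VPatho abstract says the best results came from loss weights based on the ratio of neutral to pathogenic variants, but it does not give the loss formula. The sketch below shows one common way such a ratio can be used: a binary cross-entropy in which the minority (pathogenic) class is up-weighted by the class ratio. This is an assumption for illustration, not the paper's RUS-Wg-MSResNet loss; only the two variant counts come from the abstract.

```python
import math

N_PATHOGENIC = 9619   # pathogenic GOF/LOF variants (from the abstract)
N_NEUTRAL = 138026    # neutral variants (from the abstract)

# Assumption: weight the pathogenic class by the neutral-to-pathogenic ratio.
POS_WEIGHT = N_NEUTRAL / N_PATHOGENIC


def weighted_bce(y_true, p_pred, pos_weight=POS_WEIGHT, eps=1e-12):
    """Mean binary cross-entropy with positive examples scaled by pos_weight."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip probabilities to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical labels (1 = pathogenic) and predicted probabilities.
print(round(POS_WEIGHT, 2), round(weighted_bce([1, 0, 0], [0.7, 0.2, 0.1]), 3))
```

With this weighting, misclassifying one pathogenic variant costs roughly as much as misclassifying fourteen neutral ones, which counteracts the class imbalance in the training data.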
16.
Brief Bioinform ; 23(6), 2022 Nov 19.
Article in English | MEDLINE | ID: mdl-36094083

ABSTRACT

Short open reading frames (sORFs) refer to small nucleic acid fragments no longer than 303 nt that probably encode small peptides. To date, translatable sORFs have been found in both untranslated regions of messenger ribonucleic acids (RNAs; mRNAs) and long non-coding RNAs (lncRNAs), playing vital roles in a myriad of biological processes. As not all sORFs are translated or essentially translatable, it is important to develop a highly accurate computational tool for characterizing the coding potential of sORFs, thereby facilitating discovery of novel functional peptides. In light of this, we designed a series of ensemble models by integrating Efficient-CapsNet and LightGBM, collectively termed csORF-finder, to differentiate the coding sORFs (csORFs) from non-coding sORFs in Homo sapiens, Mus musculus and Drosophila melanogaster, respectively. To improve the performance of csORF-finder, we introduced a novel feature encoding scheme named trinucleotide deviation from expected mean (TDE) and computed all types of in-frame sequence-based features, such as i-framed-3mer, i-framed-CKSNAP and i-framed-TDE. Benchmarking results showed that these features could significantly boost the performance compared to the original 3-mer, CKSNAP and TDE features. Our performance comparisons showed that csORF-finder outperformed the state-of-the-art methods for csORF prediction on multi-species and non-ATG initiation independent test datasets. Furthermore, we applied csORF-finder to screen the lncRNA datasets for identifying potential csORFs. The resulting data serve as an important computational repository for further experimental validation. We hope that csORF-finder can be exploited as a powerful platform for high-throughput identification of csORFs and functional characterization of csORF-encoded peptides.


Subjects
Open Reading Frames; RNA, Long Noncoding; Animals; Mice; Drosophila melanogaster/genetics; Machine Learning; Peptides/genetics; RNA, Long Noncoding/genetics; RNA, Messenger/genetics; Humans
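The csORF-finder abstract contrasts in-frame features such as i-framed-3mer with the original 3-mer features. The abstract does not define the encoding, so the sketch below rests on an assumed reading: in-frame counting takes non-overlapping codons starting at position 0, whereas ordinary 3-mer counting slides one base at a time. The function names and the example sequence are hypothetical.

```python
from collections import Counter


def in_frame_3mers(seq):
    """Count non-overlapping codons read in frame from position 0."""
    seq = seq.upper()
    # Step by 3 so every 3-mer is a codon; trailing 1-2 bases are dropped.
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))


def sliding_3mers(seq):
    """Count all overlapping 3-mers, ignoring the reading frame."""
    seq = seq.upper()
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))


orf = "ATGGCCTAA"  # hypothetical toy ORF: start codon, one codon, stop codon
print(in_frame_3mers(orf))
print(sum(sliding_3mers(orf).values()))
```

The in-frame version keeps codon identity (for example, whether a stop codon actually occurs in frame), which the frame-agnostic count mixes with 3-mers from the other two frames.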
17.
J Chem Inf Model ; 62(17): 4270-4282, 2022 Sep 12.
Article in English | MEDLINE | ID: mdl-35973091

ABSTRACT

An essential step in engineering proteins and understanding disease-causing missense mutations is to accurately model protein stability changes when such mutations occur. Here, we developed a new sequence-based predictor for the protein stability (PROST) change (Gibbs free energy change, ΔΔG) upon a single-point missense mutation. PROST extracts multiple descriptors from the most promising sequence-based predictors, such as BoostDDG, SAAFEC-SEQ, and DDGun. PROST also extracts descriptors from iFeature and AlphaFold2. The extracted descriptors include sequence-based features, physicochemical properties, evolutionary information, evolutionary-based physicochemical properties, and predicted structural features. The PROST predictor is a weighted average ensemble model based on extreme gradient boosting (XGBoost) decision trees and an extra-trees regressor; PROST is trained on both direct and hypothetical reverse mutations using the S5294 data set (S2647 direct mutations + S2647 inverse mutations). The parameters for the PROST model are optimized using grid searching with 5-fold cross-validation, and feature importance analysis unveils the most relevant features. The performance of PROST is evaluated in a blinded manner, employing nine distinct data sets and existing state-of-the-art sequence-based and structure-based predictors. This method consistently performs well on frataxin, S217, S349, Ssym, S669, Myoglobin, and CAGI5 data sets in blind tests and similarly to the state-of-the-art predictors for p53 and S276 data sets. When the performance of PROST is compared with the latest predictors such as BoostDDG, SAAFEC-SEQ, ACDC-NN-seq, and DDGun, PROST dominates these predictors. A case study of mutation scanning of the frataxin protein for nine wild-type residues demonstrates the utility of PROST. Taken together, these findings indicate that PROST is a well-suited predictor when no protein structural information is available. The source code of PROST, data sets, examples, and pretrained models along with how to use PROST are available at https://github.com/ShahidIqb/PROST and https://prost.erc.monash.edu/seq.


Subjects
Mutation, Missense; Zygote Intrafallopian Transfer; Protein Stability; Proteins/chemistry; Software
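The PROST abstract describes two ideas that are easy to illustrate: training on hypothetical reverse mutations, and combining two regressors by a weighted average. The sketch below assumes the usual thermodynamic antisymmetry, ΔΔG(mutant→wild type) = -ΔΔG(wild type→mutant), to build the reverse records. The record layout, the example values and the weight_a parameter are hypothetical; the actual PROST pipeline and weights are not given in the abstract.

```python
def add_reverse_mutations(records):
    """Augment (wild_type, position, mutant, ddg) records with their reverses.

    Assumes antisymmetry: ddG(mutant -> wild type) = -ddG(wild type -> mutant).
    """
    reverse = [(mut, pos, wt, -ddg) for wt, pos, mut, ddg in records]
    return records + reverse


def weighted_average(pred_a, pred_b, weight_a=0.5):
    """Blend two models' predictions; weight_a is a hypothetical tuning knob."""
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Hypothetical direct mutations with ddG values in kcal/mol.
direct = [("A", 10, "G", -1.2), ("L", 42, "P", -2.5)]
augmented = add_reverse_mutations(direct)
print(len(augmented))
print(weighted_average([-1.0, 0.5], [-2.0, 1.5], weight_a=0.75))
```

Doubling the data this way mirrors the S2647 direct plus S2647 inverse composition mentioned in the abstract. Note that in a real pipeline the features of a reverse record would be computed on the mutant sequence, not just relabeled.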
18.
Org Lett ; 24(17): 3138-3143, 2022 May 06.
Article in English | MEDLINE | ID: mdl-35452582

ABSTRACT

We report herein that copper(I) catalysis with a bis(phosphine) dioxide ligand enables the desymmetric C-H arylation of prochiral bipyrroles. More than 50 nitrogen-nitrogen atropisomers were obtained in good to excellent yields with excellent enantioselectivities (up to 97% yield, up to 98% ee). The reaction proceeds under mild conditions with good functional group compatibility on arenes and diaryliodonium salts. Moreover, this principle enables iterative arylation of the bipyrroles, so that different positions can be arylated enantioselectively under copper catalysis.

19.
IEEE/ACM Trans Comput Biol Bioinform ; 19(5): 2749-2759, 2022.
Article in English | MEDLINE | ID: mdl-34347603

ABSTRACT

Cell-penetrating peptides (CPPs) are special peptides capable of carrying a variety of bioactive molecules, such as genetic materials, short interfering RNAs and nanoparticles, into cells. Recently, research on CPPs has attracted substantial interest, and the biological mechanisms of CPPs have been assessed in the context of safe drug delivery agents and therapeutic applications. Correct identification and synthesis of CPPs using traditional biochemical methods is an extremely slow, expensive and laborious task, particularly given the large volume of unannotated peptide sequences accumulating in the World Bank repository. Hence, a powerful bioinformatics predictor that rapidly identifies CPPs with a high recognition rate is urgently needed. To date, numerous computational methods have been developed for CPP prediction. However, the available machine-learning (ML) tools are unable to identify both CPPs and their uptake efficiencies. This study aimed to develop a two-layer deep learning framework named DeepCPPred that identifies CPPs in the first phase and peptide uptake efficiency in the second phase. The DeepCPPred predictor first uses four types of descriptors that cover evolutionary, energy estimation, reduced sequence and amino-acid contact information. Then, the extracted features are optimized through the elastic net algorithm and fed into a cascade deep forest algorithm to build the final CPP model. The proposed method achieved 99.45 percent overall accuracy on the CPP924 benchmark dataset in the first layer and 95.43 percent accuracy in the second layer on the CPPSite3 dataset using a 5-fold cross-validation test. Thus, our proposed bioinformatics tool surpassed all the existing state-of-the-art sequence-based CPP approaches.
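
The two-layer flow described in this abstract (first decide whether a peptide is a CPP, then estimate uptake efficiency only for predicted CPPs) can be sketched as follows. This is only an illustration of the control flow: the two decision rules are toy stand-ins based on the fraction of basic residues, not the DeepCPPred models, and every name and threshold here is an assumption.

```python
from typing import Callable, Optional, Tuple

BASIC_RESIDUES = set("RK")  # arginine and lysine


def basic_fraction(peptide: str) -> float:
    """Fraction of residues that are arginine or lysine."""
    if not peptide:
        raise ValueError("peptide must be non-empty")
    return sum(aa in BASIC_RESIDUES for aa in peptide.upper()) / len(peptide)


def toy_is_cpp(peptide: str) -> bool:
    """Toy first-layer rule (assumed threshold; stands in for a trained classifier)."""
    return basic_fraction(peptide) >= 0.3


def toy_uptake(peptide: str) -> str:
    """Toy second-layer rule (assumed threshold; stands in for a trained classifier)."""
    return "high" if basic_fraction(peptide) >= 0.5 else "low"


def two_layer_predict(
    peptide: str,
    is_cpp: Callable[[str], bool] = toy_is_cpp,
    uptake: Callable[[str], str] = toy_uptake,
) -> Tuple[str, Optional[str]]:
    """Run layer 1; run layer 2 only when layer 1 predicts a CPP."""
    if not is_cpp(peptide):
        return ("non-CPP", None)
    return ("CPP", uptake(peptide))


if __name__ == "__main__":
    for seq in ("RRRRRRRR", "RKAAAA", "AAAAGGGG"):
        print(seq, two_layer_predict(seq))
```

Passing the two layers as callables keeps the cascade logic independent of whichever trained models are plugged in.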


Subjects
Cell-Penetrating Peptides , Deep Learning , Amino Acid Sequence , Cell-Penetrating Peptides/chemistry , Computational Biology/methods , Machine Learning
20.
Article in English | MEDLINE | ID: mdl-33280588

ABSTRACT

AIM AND OBJECTIVE: Missense mutations (MMs) may lead to various human diseases by disabling proteins. Accurate prediction of MMs is important and challenging for both protein function annotation and drug design. Although several computational methods have achieved acceptable success rates, there is still room to improve the prediction performance for MMs. MATERIALS AND METHODS: In the present study, we designed a new feature extraction method that considers the degree of impact of residues within the microenvironment of the mutation site. Stringent cross-validation and independent tests on benchmark datasets were performed to evaluate the efficacy of the proposed feature extraction method. Furthermore, three heterogeneous prediction models were trained and then ensembled for the final prediction. By combining the feature representation method and the classifier ensemble technique, we developed a novel MM predictor called TargetMM for distinguishing pathogenic mutations from neutral ones. RESULTS: Comparisons based on statistical evaluation demonstrate that TargetMM outperforms prior advanced methods on the independent test data. The source codes and benchmark datasets of TargetMM are freely available at https://github.com/sera616/TargetMM.git for academic use.
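
The abstract states that three heterogeneous models are ensembled but does not say how their outputs are combined. As one common option, a majority vote over three binary predictions is sketched below; the rule and the function name are assumptions made for illustration, not the TargetMM implementation.

```python
from collections import Counter
from typing import Sequence


def majority_vote(labels: Sequence[str]) -> str:
    """Return the label predicted by most models.

    With three models and two classes a tie cannot occur; for other
    configurations, ties resolve to the label that was seen first.
    """
    if not labels:
        raise ValueError("at least one prediction is required")
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    # Stand-in outputs of three hypothetical classifiers for one mutation.
    print(majority_vote(["pathogenic", "neutral", "pathogenic"]))
```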


Subjects
Algorithms , Missense Mutation , Humans , Proteins/chemistry , Software