Results 1 - 18 of 18
1.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37580175

ABSTRACT

Protein engineering is an emerging field in biotechnology that has the potential to revolutionize various areas, such as antibody design, drug discovery, food security, ecology, and more. However, the mutational space involved is too vast to be handled through experimental means alone. Leveraging accumulative protein databases, machine learning (ML) models, particularly those based on natural language processing (NLP), have considerably expedited protein engineering. Moreover, advances in topological data analysis (TDA) and artificial intelligence-based protein structure prediction, such as AlphaFold2, have made more powerful structure-based ML-assisted protein engineering strategies possible. This review aims to offer a comprehensive, systematic, and indispensable set of methodological components, including TDA and NLP, for protein engineering and to facilitate their future development.


Subjects
Artificial Intelligence , Protein Engineering , Natural Language Processing , Antibodies , Data Analysis
2.
Chem Rev ; 122(13): 11287-11368, 2022 07 13.
Article in English | MEDLINE | ID: mdl-35594413

ABSTRACT

Despite tremendous efforts in the past two years, our understanding of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), virus-host interactions, immune response, virulence, transmission, and evolution is still very limited. This limitation calls for further in-depth investigation. Computational studies have become an indispensable component in combating coronavirus disease 2019 (COVID-19) due to their low cost, their efficiency, and the fact that they are free from safety and ethical constraints. Additionally, the mechanism that governs the global evolution and transmission of SARS-CoV-2 cannot be revealed from individual experiments and was discovered by integrating genotyping of massive viral sequences, biophysical modeling of protein-protein interactions, deep mutational data, deep learning, and advanced mathematics. There exists a tsunami of literature on the molecular modeling, simulations, and predictions of SARS-CoV-2 and related developments of drugs, vaccines, antibodies, and diagnostics. To provide readers with a quick update about this literature, we present a comprehensive and systematic methodology-centered review. Aspects such as molecular biophysics, bioinformatics, cheminformatics, machine learning, and mathematics are discussed. This review will be beneficial to researchers who are looking for ways to contribute to SARS-CoV-2 studies and those who are interested in the status of the field.


Subjects
COVID-19 , SARS-CoV-2 , Humans , Molecular Models
3.
J Chem Inf Model ; 63(1): 335-342, 2023 01 09.
Article in English | MEDLINE | ID: mdl-36577010

ABSTRACT

Accurate and reliable forecasting of emerging dominant severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants enables policymakers and vaccine makers to prepare for future waves of infections. The last three waves of SARS-CoV-2 infections caused by dominant variants, Omicron (BA.1), BA.2, and BA.4/BA.5, were accurately foretold by our artificial intelligence (AI) models built with biophysics, genotyping of viral genomes, experimental data, algebraic topology, and deep learning. On the basis of newly available experimental data, we analyzed the impacts of all possible viral spike (S) protein receptor-binding domain (RBD) mutations on SARS-CoV-2 infectivity. Our analysis sheds light on viral evolutionary mechanisms, i.e., natural selection through infectivity strengthening and antibody resistance. We forecast that BP.1, BL*, BA.2.75*, BQ.1*, and particularly BN.1* have a high potential to become the new dominant variants to drive the next surge. Our key projection about these variants' dominance made on Oct. 18, 2022 (see arXiv:2210.09485) became reality in late November 2022.


Subjects
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Artificial Intelligence , Antibodies
4.
PLoS Comput Biol ; 17(6): e1009077, 2021 06.
Article in English | MEDLINE | ID: mdl-34161317

ABSTRACT

The vertebrate hindbrain is segmented into rhombomeres (r) initially defined by distinct domains of gene expression. Previous studies have shown that noise-induced gene regulation and cell sorting are critical for the sharpening of rhombomere boundaries, which start out rough in the forming neural plate (NP) and sharpen over time. However, the mechanisms controlling simultaneous formation of multiple rhombomeres and accuracy in their sizes are unclear. We have developed a stochastic multiscale cell-based model that explicitly incorporates dynamic morphogenetic changes (i.e. convergent-extension of the NP), multiple morphogens, and gene regulatory networks to investigate the formation of rhombomeres and their corresponding boundaries in the zebrafish hindbrain. During pattern initiation, the short-range signal, fibroblast growth factor (FGF), works together with the longer-range morphogen, retinoic acid (RA), to specify all of these boundaries and maintain accurately sized segments with sharp boundaries. At later stages of patterning, we show a nonlinear change in the shape of rhombomeres with rapid left-right narrowing of the NP followed by slower dynamics. Rapid initial convergence improves boundary sharpness and segment size by regulating cell sorting and cell fate both independently and coordinately. Overall, multiple morphogens and tissue dynamics synergize to regulate the sizes and boundaries of multiple segments during development.


Subjects
Body Patterning/physiology , Biological Models , Zebrafish/embryology , Animals , Body Patterning/genetics , Computational Biology , Embryonic Development/genetics , Embryonic Development/physiology , Fibroblast Growth Factors/physiology , Developmental Gene Expression Regulation , Growth Substances/physiology , Rhombencephalon/cytology , Rhombencephalon/embryology , Signal Transduction , Stochastic Processes , Tretinoin/physiology , Zebrafish/genetics
5.
J Chem Inf Model ; 62(19): 4629-4641, 2022 10 10.
Article in English | MEDLINE | ID: mdl-36154171

ABSTRACT

Directed evolution, a revolutionary biotechnology in protein engineering, optimizes protein fitness by searching an astronomical mutational space via expensive experiments. Cluster learning-assisted directed evolution (CLADE) efficiently explores the mutational space via a combination of unsupervised hierarchical clustering and supervised learning. However, the initial-stage sampling in CLADE treats all clusters equally despite many clusters containing a large portion of non-functional mutations. Recent statistical and deep learning tools enable evolutionary density modeling to assess protein fitness in an unsupervised manner. In this work, we construct an ensemble of multiple evolutionary scores to guide the initial sampling in CLADE. The resulting evolutionary score-enhanced CLADE, called CLADE 2.0, efficiently selects a training set within a small informative space using evolution-driven clustering sampling. CLADE 2.0 is validated on two benchmark libraries, each containing 160,000 sequences from four-site mutational combinations. Extensive computational experiments and comparisons with existing cutting-edge methods indicate that CLADE 2.0 is a new state-of-the-art tool for machine learning-assisted directed evolution.


Subjects
Machine Learning , Proteins , Cluster Analysis , Protein Engineering/methods , Proteins/genetics
6.
J Pharmacokinet Pharmacodyn ; 49(1): 39-50, 2022 02.
Article in English | MEDLINE | ID: mdl-34637069

ABSTRACT

Quantitative systems pharmacology (QSP) is an important approach in pharmaceutical research and development that facilitates in silico generation of quantitative mechanistic hypotheses and enables in silico trials. As demonstrated by applications from numerous industry groups and interest from regulatory authorities, QSP is becoming an increasingly critical component in clinical drug development. With rapidly evolving computational tools and methods, QSP modeling has achieved important progress in pharmaceutical research and development, including for heart failure (HF). However, various challenges exist in the QSP modeling and clinical characterization of HF. Machine/deep learning (ML/DL) methods have had success in a wide variety of fields and disciplines. They provide data-driven approaches to HF diagnosis and modeling, and offer a novel strategy to inform QSP model development and calibration. The combination of ML/DL and QSP modeling is becoming an emerging direction in the understanding of HF and the clinical development of new therapies. In this work, we review the current status and achievements of QSP and ML/DL for HF, and discuss remaining challenges and future perspectives in the field.


Subjects
Heart Failure , Pharmacology , Calibration , Heart Failure/diagnosis , Heart Failure/drug therapy , Humans , Machine Learning , Biological Models , Network Pharmacology
7.
Exp Dermatol ; 28(4): 493-502, 2019 04.
Article in English | MEDLINE | ID: mdl-30801791

ABSTRACT

Following injury, skin activates a complex wound healing programme. While cellular and signalling mechanisms of wound repair have been extensively studied, the principles of epidermal-dermal interactions and their effects on wound healing outcomes are only partially understood. To gain new insight into the effects of epidermal-dermal interactions, we developed a multiscale, hybrid mathematical model of skin wound healing. The model takes into consideration interactions between epidermis and dermis across the basement membrane via diffusible signals, defined as activator and inhibitor. Simulations revealed that epidermal-dermal interactions are critical for proper extracellular matrix deposition in the dermis, suggesting these signals may influence how wound scars form. Our model makes several theoretical predictions. First, basal levels of epidermal activator and inhibitor help to maintain dermis in a steady state, whereas their absence results in a raised, scar-like dermal phenotype. Second, wound-triggered increase in activator and inhibitor production by basal epidermal cells, coupled with fast re-epithelialization kinetics, reduces dermal scar size. Third, high-density fibrin clot leads to a raised, hypertrophic scar phenotype, whereas low-density fibrin clot leads to a hypotrophic phenotype. Fourth, shallow wounds, compared to deep wounds, result in overall reduced scarring. Taken together, our model predicts the important role of signalling across the dermal-epidermal interface and the effect of fibrin clot density and wound geometry on scar formation. This hybrid modelling approach may also be applicable to other complex tissue systems, enabling the simulation of dynamic processes that are otherwise computationally prohibitive with fully discrete models due to a large number of variables.


Subjects
Dermis/metabolism , Epidermis/metabolism , Biological Models , Wound Healing , Animals , Cicatrix/etiology , Fibrin/metabolism , Fibroblasts/metabolism , Keratinocytes/metabolism
8.
Discrete Continuous Dyn Syst Ser B ; 24(8): 3971-3994, 2019 Aug.
Article in English | MEDLINE | ID: mdl-32269502

ABSTRACT

During epithelial tissue maintenance, lineages of cells differentiate and proliferate in a coordinated way to provide the desired size and spatial organization of different cell types. While deterministic mathematical models have been used to dissect the role of feedback regulation in tissue layer size and stratification, how stochastic effects influence tissue maintenance remains largely unknown. Here we present a stochastic continuum model for cell lineages to investigate how both layer thickness and layer stratification are affected by noise. We find that cell-intrinsic noise often causes reduction and oscillation of layer size, whereas cell-extrinsic noise increases the thickness and sometimes leads to uncontrollable growth of the tissue layer. Layer stratification usually deteriorates as the noise level increases in the cell lineage systems. Interestingly, morphogen noise, which mixes both cell-intrinsic and cell-extrinsic noise, can lead to a larger layer size with little impact on layer stratification. By investigating different combinations of the three types of noise, we find that layer thickness variability is reduced when the cell-extrinsic noise level is high or the morphogen noise level is low. Interestingly, there exists a tradeoff between low thickness variability and strong layer stratification due to competition among the three types of noise, suggesting that robust layer homeostasis requires balanced levels of different types of noise in the cell lineage systems.

9.
Discrete Continuous Dyn Syst Ser B ; 24(12): 6387-6417, 2019 Dec.
Article in English | MEDLINE | ID: mdl-32405272

ABSTRACT

The second-order implicit integration factor method (IIF2) is effective at solving stiff reaction-diffusion equations owing to its favorable stability condition. IIF has previously been applied primarily to systems in which the reaction contained no explicitly time-dependent terms and the boundary conditions were homogeneous. When applied to a system with explicitly time-dependent reaction terms, we find that IIF2 requires prohibitively small time-steps, proportional to the square of the spatial grid size, to attain its theoretical second-order temporal accuracy. Although the second-order implicit exponential time differencing (iETD2) method can accurately handle explicitly time-dependent reactions, it is more computationally expensive than IIF2. In this paper, we develop a hybrid approach that combines the advantages of both methods, applying IIF2 to reaction terms that are not explicitly time-dependent and applying iETD2 to those that are. The second-order hybrid IIF-ETD method (hIFE2) inherits the lower complexity of IIF2 and, from iETD2, the ability to remain second-order accurate in time for large time-steps. It also inherits the unconditional stability of the IIF2 and iETD2 methods for dealing with the stiffness in reaction-diffusion systems. Through a transformation, hIFE2 can handle nonhomogeneous boundary conditions accurately and efficiently. In addition, this approach can be naturally combined with the compact and array representations of IIF and ETD for systems in higher spatial dimensions. Various numerical simulations containing linear and nonlinear reactions are presented to demonstrate the superior stability, accuracy, and efficiency of the new hIFE method.

10.
Nat Mach Intell ; 6(1): 25-39, 2024.
Article in English | MEDLINE | ID: mdl-38274364

ABSTRACT

Time-series single-cell RNA sequencing (scRNA-seq) datasets provide unprecedented opportunities to learn dynamic processes of cellular systems. Due to the destructive nature of sequencing, it remains challenging to link the scRNA-seq snapshots sampled at different time points. Here we present TIGON, a dynamic, unbalanced optimal transport algorithm that reconstructs dynamic trajectories and population growth simultaneously as well as the underlying gene regulatory network from multiple snapshots. To tackle the high-dimensional optimal transport problem, we introduce a deep learning method using a dimensionless formulation based on the Wasserstein-Fisher-Rao (WFR) distance. TIGON is evaluated on simulated data and compared with existing methods for its robustness and accuracy in predicting cell state transition and cell population growth. Using three scRNA-seq datasets, we show the importance of growth in the temporal inference, TIGON's capability in reconstructing gene expression at unmeasured time points and its applications to temporal gene regulatory networks and cell-cell communication inference.

11.
Nat Comput Sci ; 3(2): 149-163, 2023 Feb.
Article in English | MEDLINE | ID: mdl-37637776

ABSTRACT

While protein engineering, which iteratively optimizes protein fitness by screening the gigantic mutational space, is constrained by experimental capacity, various machine learning models have substantially expedited protein engineering. Three-dimensional protein structures promise further advantages, but their intricate geometric complexity hinders their applications in deep mutational screening. Persistent homology, an established algebraic topology tool for protein structural complexity reduction, fails to capture the homotopic shape evolution during the filtration of given data. This work introduces a Topology-offered protein Fitness (TopFit) framework to complement protein sequence and structure embeddings. Equipped with an ensemble regression strategy, TopFit integrates the persistent spectral theory, a new topological Laplacian, and two auxiliary sequence embeddings to capture mutation-induced topological invariants, shape evolution, and sequence disparity in the protein fitness landscape. The performance of TopFit is assessed on 34 benchmark datasets with 128,634 variants, involving a vast variety of protein structure acquisition modalities and training set size variations.

12.
ArXiv ; 2023 Jul 27.
Article in English | MEDLINE | ID: mdl-37547662

ABSTRACT

Protein engineering is an emerging field in biotechnology that has the potential to revolutionize various areas, such as antibody design, drug discovery, food security, ecology, and more. However, the mutational space involved is too vast to be handled through experimental means alone. Leveraging accumulative protein databases, machine learning (ML) models, particularly those based on natural language processing (NLP), have considerably expedited protein engineering. Moreover, advances in topological data analysis (TDA) and artificial intelligence-based protein structure prediction, such as AlphaFold2, have made more powerful structure-based ML-assisted protein engineering strategies possible. This review aims to offer a comprehensive, systematic, and indispensable set of methodological components, including TDA and NLP, for protein engineering and to facilitate their future development.

13.
Commun Biol ; 6(1): 536, 2023 05 18.
Article in English | MEDLINE | ID: mdl-37202415

ABSTRACT

Virtual screening (VS) is a critical technique in understanding biomolecular interactions, particularly in drug design and discovery. However, the accuracy of current VS models heavily relies on three-dimensional (3D) structures obtained through molecular docking, which is often unreliable due to its low accuracy. To address this issue, we introduce sequence-based virtual screening (SVS) as another generation of VS models that utilizes advanced natural language processing (NLP) algorithms and optimized deep K-embedding strategies to encode biomolecular interactions without relying on 3D structure-based docking. We demonstrate that SVS exceeds state-of-the-art performance on four regression datasets involving protein-ligand binding, protein-protein binding, protein-nucleic acid binding, and ligand inhibition of protein-protein interactions, and on five classification datasets for protein-protein interactions in five biological species. SVS has the potential to transform current practices in drug discovery and protein engineering.


Subjects
Algorithms , Proteins , Molecular Docking Simulation , Ligands , Proteins/metabolism , Drug Discovery/methods
14.
Comput Biol Med ; 151(Pt A): 106262, 2022 12.
Article in English | MEDLINE | ID: mdl-36379191

ABSTRACT

Due to its high transmissibility, Omicron BA.1 ousted the Delta variant to become the dominant variant in late 2021 and was replaced by the more transmissible Omicron BA.2 in March 2022. An important question is which new variants will dominate in the future. Topology-based deep learning models have had tremendous success in forecasting emerging variants in the past. However, topology is insensitive to homotopic shape evolution in virus-human protein-protein binding, which is crucial to viral evolution and transmission. This challenge is tackled with the persistent Laplacian, which is able to capture both the topological change and homotopic shape evolution of data. Persistent Laplacian-based deep learning models are developed to systematically evaluate variant infectivity. Our comparative analysis of Alpha, Beta, Gamma, Delta, Lambda, Mu, and Omicron BA.1, BA.1.1, BA.2, BA.2.11, BA.2.12.1, BA.3, BA.4, and BA.5 unveils that Omicron BA.2.11, BA.2.12.1, BA.3, BA.4, and BA.5 are more contagious than BA.2. In particular, BA.4 and BA.5 are about 36% more infectious than BA.2 and are projected to become new dominant variants by natural selection. Moreover, the proposed models outperform the state-of-the-art methods on three major benchmark datasets for mutation-induced protein-protein binding free energy changes. Our key projection about BA.4 and BA.5's dominance made on May 1, 2022 (see arXiv:2205.00532) became a reality in late June 2022.


Subjects
Benchmarking , Humans , Mutation
15.
ArXiv ; 2022 Oct 18.
Article in English | MEDLINE | ID: mdl-36299737

ABSTRACT

Accurate and reliable forecasting of emerging dominant severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants enables policymakers and vaccine makers to get prepared for future waves of infections. The last three waves of SARS-CoV-2 infections caused by dominant variants Omicron (BA.1), BA.2, and BA.4/BA.5 were accurately foretold by our artificial intelligence (AI) models built with biophysics, genotyping of viral genomes, experimental data, algebraic topology, and deep learning. Based on newly available experimental data, we analyzed the impacts of all possible viral spike (S) protein receptor-binding domain (RBD) mutations on the SARS-CoV-2 infectivity. Our analysis sheds light on viral evolutionary mechanisms, i.e., natural selection through infectivity strengthening and antibody resistance. We forecast that BA.2.10.4, BA.2.75, BQ.1.1, and particularly, BA.2.75+R346T, have high potential to become new dominant variants to drive the next surge.

16.
Nat Comput Sci ; 1(12): 809-818, 2021 Dec.
Article in English | MEDLINE | ID: mdl-35811998

ABSTRACT

Directed evolution, a strategy for protein engineering, optimizes protein properties (i.e., fitness) by expensive and time-consuming screening or selection of a large mutational sequence space. Machine learning-assisted directed evolution (MLDE), which screens sequence properties in silico, can accelerate the optimization and reduce the experimental burden. This work introduces an MLDE framework, cluster learning-assisted directed evolution (CLADE), that combines hierarchical unsupervised clustering sampling and supervised learning to guide protein engineering. The clustering sampling selectively picks and screens variants in targeted subspaces, which guides the subsequent generation of diverse training sets. In the last stage, accurate predictions via supervised learning models improve final outcomes. By sequentially screening 480 sequences out of 160,000 in a four-site combinatorial library with five equal experimental batches, CLADE achieves global maximal fitness hit rates of up to 91.0% and 34.0% for the GB1 and PhoQ datasets, respectively, improved from 18.6% and 7.2% obtained by random-sampling-based MLDE.

17.
Dev Cell ; 53(6): 724-739.e14, 2020 06 22.
Article in English | MEDLINE | ID: mdl-32574592

ABSTRACT

Gradients of decapentaplegic (Dpp) pattern Drosophila wing imaginal discs, establishing gene expression boundaries at specific locations. As discs grow, Dpp gradients expand, keeping relative boundary positions approximately stationary. Such scaling fails in mutants for Pentagone (pent), a gene repressed by Dpp that encodes a diffusible protein that expands Dpp gradients. Although these properties fit a recent mathematical model of automatic gradient scaling, that model requires an expander that spreads with minimal loss throughout a morphogen field. Here, we show that Pent's actions are confined to within just a few cell diameters of its site of synthesis and can be phenocopied by manipulating non-diffusible Pent targets strictly within the Pent expression domain. Using genetics and mathematical modeling, we develop an alternative model of scaling driven by feedback downregulation of Dpp receptors and co-receptors. Among the model's predictions is a size beyond which scaling fails, something we observe directly in wing discs.


Subjects
Drosophila Proteins/genetics , Extracellular Matrix Proteins/genetics , Developmental Gene Expression Regulation , Morphogenesis , Animals , Down-Regulation , Drosophila Proteins/metabolism , Drosophila melanogaster , Extracellular Matrix Proteins/metabolism , Physiological Feedback , Imaginal Discs/embryology , Imaginal Discs/metabolism , Theoretical Models
18.
Zhongguo Zhen Jiu ; 37(2): 215-218, 2017 Feb 12.
Article in Chinese | MEDLINE | ID: mdl-29231491

ABSTRACT

Acupuncture expectation refers to the subjective estimate of the effect of the acupuncture to be applied. Because sham acupuncture is commonly used in randomized clinical trials of acupuncture, acupuncture expectation has an effect on subjects, and it is necessary to evaluate and standardize it. The factors that influence the evaluation standard of acupuncture expectation are differences in how expectation values are evaluated, in the evaluation criteria, and in the time points of evaluation. These factors affect the evaluation of clinical efficacy. A unified evaluation standard urgently needs to be established to improve reliability.


Subjects
Acupuncture Therapy/psychology , Humans , Placebo Effect , Randomized Controlled Trials as Topic , Reproducibility of Results , Treatment Outcome