Results 1 - 9 of 9
1.
Proc Natl Acad Sci U S A ; 121(27): e2311888121, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38913887

ABSTRACT

The prediction of protein 3D structure from amino acid sequence is a computational grand challenge in biophysics and plays a key role in robust protein structure prediction algorithms, from drug discovery to genome interpretation. The advent of AI models, such as AlphaFold, is revolutionizing applications that depend on robust protein structure prediction algorithms. To maximize the impact, and ease the usability, of these AI tools, we introduce APACE, AlphaFold2 and advanced computing as a service, a computational framework that effectively handles this AI model and its TB-size database to conduct accelerated protein structure prediction analyses in modern supercomputing environments. We deployed APACE in the Delta and Polaris supercomputers and quantified its performance for accurate protein structure predictions using four exemplar proteins: 6AWO, 6OAN, 7MEZ, and 6D6U. Using up to 300 ensembles, distributed across 200 NVIDIA A100 GPUs, we found that APACE is up to two orders of magnitude faster than off-the-shelf AlphaFold2 implementations, reducing time-to-solution from weeks to minutes. This computational approach may be readily linked with robotics laboratories to automate and accelerate scientific discovery.


Subject(s)
Algorithms , Biophysics , Proteins , Proteins/chemistry , Biophysics/methods , Protein Conformation , Software , Computational Biology/methods , Models, Molecular
2.
Proc Natl Acad Sci U S A ; 121(27): e2311808121, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38913886

ABSTRACT

Modeling complex physical dynamics is a fundamental task in science and engineering. Traditional physics-based models are first-principled, explainable, and sample-efficient. However, they often rely on strong modeling assumptions and expensive numerical integration, requiring significant computational resources and domain expertise. While deep learning (DL) provides efficient alternatives for modeling complex dynamics, it requires a large amount of labeled training data. Furthermore, its predictions may disobey the governing physical laws and are difficult to interpret. Physics-guided DL aims to integrate first-principled physical knowledge into data-driven methods. It has the best of both worlds and is well equipped to better solve scientific problems. Recently, this field has made great progress and has drawn considerable interest across disciplines. Here, we introduce the framework of physics-guided DL with a special emphasis on learning dynamical systems. We describe the learning pipeline and categorize state-of-the-art methods under this framework. We also offer our perspectives on the open challenges and emerging opportunities.

3.
Proc Natl Acad Sci U S A ; 121(25): e2321440121, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38875143

ABSTRACT

In recent decades, a growing number of discoveries in mathematics have been assisted by computer algorithms, primarily for exploring large parameter spaces. As computers become more powerful, an intriguing possibility arises: the interplay between human intuition and computer algorithms can lead to discoveries of mathematical structures that would otherwise remain elusive. Here, we demonstrate computer-assisted discovery of a previously unknown mathematical structure, the conservative matrix field. In the spirit of the Ramanujan Machine project, we developed a massively parallel computer algorithm that found a large number of formulas, in the form of continued fractions, for numerous mathematical constants. The patterns arising from those formulas enabled the construction of the first conservative matrix fields and revealed their overarching properties. Conservative matrix fields unveil unexpected relations between different mathematical constants, such as π and ln(2), or e and the Gompertz constant. The importance of these matrix fields is further realized by their ability to connect formulas that do not have any apparent relation, thus unifying hundreds of existing formulas and generating infinitely many new formulas. We exemplify these implications on values of the Riemann zeta function ζ(n), studied for centuries across mathematics and physics. Matrix fields also enable new mathematical proofs of irrationality. For example, we use them to generalize the celebrated proof by Apéry of the irrationality of ζ(3). Utilizing thousands of personal computers worldwide, our research strategy demonstrates the power of large-scale computational approaches to tackle longstanding open problems and discover unexpected connections across diverse fields of science.

4.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39293803

ABSTRACT

As more and more protein structures are discovered, blind protein-ligand docking will play an important role in drug discovery because it can predict protein-ligand complex conformation without pocket information on the target proteins. Recently, deep learning-based methods have made significant advancements in blind protein-ligand docking, but their protein features are suboptimal because they do not fully consider the difference between potential pocket regions and non-pocket regions in protein feature extraction. In this work, we propose a pocket-guided strategy for guiding the ligand to dock to potential docking regions on a protein. To this end, we design a plug-and-play module to enhance the protein features, which can be directly incorporated into existing deep learning-based blind docking methods. The proposed module first estimates potential pocket regions on the target protein and then leverages a pocket-guided attention mechanism to enhance the protein features. Experiments are conducted on integrating our method with EquiBind and FABind, and the results show that their blind-docking performances are both significantly improved and that new state-of-the-art performance is achieved by integration with FABind.


Subject(s)
Drug Discovery , Ligands , Proteins , Algorithms , Binding Sites , Computational Biology/methods , Deep Learning , Molecular Docking Simulation , Protein Binding , Protein Conformation , Proteins/chemistry , Proteins/metabolism
5.
Adv Mater ; 36(6): e2306733, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37813548

ABSTRACT

Combining materials science, artificial intelligence (AI), physical chemistry, and other disciplines, materials informatics is continuously accelerating the vigorous development of new materials. The emergence of "GPT (Generative Pre-trained Transformer) AI" shows that the scientific research field has entered the era of intelligent civilization with "data" as the basic factor and "algorithm + computing power" as the core productivity. The continuous innovation of AI will impact the cognitive laws and scientific methods, and reconstruct the knowledge and wisdom system. This prompts deeper reflection on materials informatics. Here, a comprehensive discussion of AI models and materials infrastructures is provided, and the advances in the discovery and design of new materials are reviewed. With the rise of new research paradigms triggered by "AI for Science", the vane of materials informatics, "MatGPT", is proposed, and technical path planning is carried out from the aspects of data, descriptors, generative models, pretraining models, directed design models, collaborative training, and experimental robots, along with the efforts and preparations needed to develop a new generation of materials informatics. Finally, the challenges and constraints faced by materials informatics are discussed, in order to achieve a more digital, intelligent, and automated construction of materials informatics with the joint efforts of more interdisciplinary scientists.

6.
R Soc Open Sci ; 11(8): 231130, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39169971

ABSTRACT

Aspirations for artificial intelligence (AI) as a catalyst for scientific discovery are growing. High-profile successes deploying AI in domains such as protein folding have highlighted AI's potential to unlock new frontiers of scientific knowledge. However, the pathway from AI innovation to deployment in research is not linear. Those seeking to drive a new wave of scientific progress through the application of AI require a diffusion engine that can enhance AI adoption across disciplines. Lessons from previous waves of technology change, experiences of deploying AI in real-world contexts and an emerging research agenda from the AI for science community suggest a framework for accelerating AI adoption. This framework requires action to build supply chains of ideas between disciplines; rapidly transfer technological capabilities through open research; create AI tools that empower researchers; and embed effective data stewardship. Together, these interventions can cultivate an environment of open data science that delivers the benefits of AI across the sciences.

7.
ACS Nano ; 18(40): 27138-27166, 2024 Oct 08.
Article in English | MEDLINE | ID: mdl-39316700

ABSTRACT

Atomically precise metal nanoclusters (MNCs) represent a fascinating class of ultrasmall nanoparticles with molecule-like properties, bridging conventional metal-ligand complexes and nanocrystals. Despite their potential for various applications, synthesis challenges such as a precise understanding of varied synthetic parameters and property-driven synthesis persist, hindering their full exploitation and wider application. Incorporating smart synthesis methodologies, including a closed-loop framework of automation, data interpretation, and feedback from AI, offers promising solutions to address these challenges. In this perspective, we summarize the closed-loop smart synthesis that has been demonstrated in various nanomaterials and explore the research frontiers of smart synthesis for MNCs. Moreover, the perspectives on the inherent challenges and opportunities of smart synthesis for MNCs are discussed, aiming to provide insights and directions for future advancements in this emerging field of AI for Science. The integration of deep learning algorithms stands to substantially enrich research in smart synthesis by offering enhanced predictive capabilities, optimization strategies, and control mechanisms, thereby extending the potential of MNC synthesis.

8.
Front Mol Biosci ; 11: 1393564, 2024.
Article in English | MEDLINE | ID: mdl-39044842

ABSTRACT

Molecules are essential building blocks of life and their different conformations (i.e., shapes) crucially determine the functional role that they play in living organisms. Cryogenic Electron Microscopy (cryo-EM) allows for acquisition of large image datasets of individual molecules. Recent advances in computational cryo-EM have made it possible to learn latent variable models of conformation landscapes. However, interpreting these latent spaces remains a challenge as their individual dimensions are often arbitrary. The key message of our work is that this interpretation challenge can be viewed as an Independent Component Analysis (ICA) problem where we seek models that have the property of identifiability. That is, they have an essentially unique solution, representing a conformational latent space that separates the different degrees of freedom a molecule is equipped with in nature. Thus, we aim to advance the computational field of cryo-EM beyond visualizations as we connect it with the theoretical framework of (nonlinear) ICA and discuss the need for identifiable models, improved metrics, and benchmarks. Moving forward, we propose future directions for enhancing the disentanglement of latent spaces in cryo-EM, refining evaluation metrics and exploring techniques that leverage physics-based decoders of biomolecular systems. Moreover, we discuss how future technological developments in time-resolved single particle imaging may enable the application of nonlinear ICA models that can discover the true conformation changes of molecules in nature. The pursuit of interpretable conformational latent spaces will empower researchers to unravel complex biological processes and facilitate targeted interventions. This has significant implications for drug discovery and structural biology more broadly. More generally, latent variable models are deployed widely across many scientific disciplines. Thus, the argument we present in this work has much broader applications in AI for science if we want to move from impressive nonlinear neural network models to mathematically grounded methods that can help us learn something new about nature.

9.
Patterns (N Y) ; 5(5): 100955, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38800367

ABSTRACT

Materials scientists usually collect experimental data to summarize experiences and predict improved materials. However, a crucial issue is how to proficiently utilize unstructured data to update existing structured data, particularly in applied disciplines. This study introduces a new natural language processing (NLP) task called structured information inference (SII) to address this problem. We propose an end-to-end approach to summarize and organize the multi-layered device-level information from the literature into structured data. After comparing different methods, we fine-tuned LLaMA with an F1 score of 87.14% to update an existing perovskite solar cell dataset with articles published since its release, allowing its direct use in subsequent data analysis. Using structured information, we developed regression tasks to predict the electrical performance of solar cells. Our results demonstrate comparable performance to traditional machine-learning methods without feature selection and highlight the potential of large language models for scientific knowledge acquisition and material development.
