Pesquisa | Secretaria de Estado da Saúde

CryoSegNet: accurate cryo-EM protein particle picking by integrating the foundational AI image segmentation model and attention-gated U-Net.

Gyawali, Rajan; Dhakal, Ashwin; Wang, Liguo; Cheng, Jianlin.

Brief Bioinform ; 25(4)2024 May 23.

Artigo em Inglês | MEDLINE | ID: mdl-38860738

RESUMO

Picking protein particles in cryo-electron microscopy (cryo-EM) micrographs is a crucial step in the cryo-EM-based structure determination. However, existing methods trained on a limited amount of cryo-EM data still cannot accurately pick protein particles from noisy cryo-EM images. The general foundational artificial intelligence-based image segmentation model such as Meta's Segment Anything Model (SAM) cannot segment protein particles well because their training data do not include cryo-EM images. Here, we present a novel approach (CryoSegNet) of integrating an attention-gated U-shape network (U-Net) specially designed and trained for cryo-EM particle picking and the SAM. The U-Net is first trained on a large cryo-EM image dataset and then used to generate input from original cryo-EM images for SAM to make particle pickings. CryoSegNet shows both high precision and recall in segmenting protein particles from cryo-EM micrographs, irrespective of protein type, shape and size. On several independent datasets of various protein types, CryoSegNet outperforms two top machine learning particle pickers crYOLO and Topaz as well as SAM itself. The average resolution of density maps reconstructed from the particles picked by CryoSegNet is 3.33 Å, 7% better than 3.58 Å of Topaz and 14% better than 3.87 Å of crYOLO. It is publicly available at https://github.com/jianlin-cheng/CryoSegNet.

Assuntos

Microscopia Crioeletrônica , Processamento de Imagem Assistida por Computador , Microscopia Crioeletrônica/métodos , Processamento de Imagem Assistida por Computador/métodos , Proteínas/química , Inteligência Artificial , Algoritmos , Bases de Dados de Proteínas

CryoTransformer: a transformer model for picking protein particles from cryo-EM micrographs.

Dhakal, Ashwin; Gyawali, Rajan; Wang, Liguo; Cheng, Jianlin.

Bioinformatics ; 40(3)2024 Mar 04.

Artigo em Inglês | MEDLINE | ID: mdl-38407301

RESUMO

MOTIVATION: Cryo-electron microscopy (cryo-EM) is a powerful technique for determining the structures of large protein complexes. Picking single protein particles from cryo-EM micrographs (images) is a crucial step in reconstructing protein structures from them. However, the widely used template-based particle picking process requires some manual particle picking and is labor-intensive and time-consuming. Though machine learning and artificial intelligence (AI) can potentially automate particle picking, the current AI methods pick particles with low precision or low recall. The erroneously picked particles can severely reduce the quality of reconstructed protein structures, especially for the micrographs with low signal-to-noise ratio. RESULTS: To address these shortcomings, we devised CryoTransformer based on transformers, residual networks, and image processing techniques to accurately pick protein particles from cryo-EM micrographs. CryoTransformer was trained and tested on the largest labeled cryo-EM protein particle dataset-CryoPPP. It outperforms the current state-of-the-art machine learning methods of particle picking in terms of the resolution of 3D density maps reconstructed from the picked particles as well as F1-score, and is poised to facilitate the automation of the cryo-EM protein particle picking. AVAILABILITY AND IMPLEMENTATION: The source code and data for CryoTransformer are openly available at: https://github.com/jianlin-cheng/CryoTransformer.

Assuntos

Inteligência Artificial , Software , Microscopia Crioeletrônica/métodos , Aprendizado de Máquina , Processamento de Imagem Assistida por Computador/métodos , Proteínas

Artificial intelligence in the prediction of protein-ligand interactions: recent advances and future directions.

Dhakal, Ashwin; McKay, Cole; Tanner, John J; Cheng, Jianlin.

Brief Bioinform ; 23(1)2022 01 17.

Artigo em Inglês | MEDLINE | ID: mdl-34849575

RESUMO

New drug production, from target identification to marketing approval, takes over 12 years and can cost around $2.6 billion. Furthermore, the COVID-19 pandemic has unveiled the urgent need for more powerful computational methods for drug discovery. Here, we review the computational approaches to predicting protein-ligand interactions in the context of drug discovery, focusing on methods using artificial intelligence (AI). We begin with a brief introduction to proteins (targets), ligands (e.g. drugs) and their interactions for nonexperts. Next, we review databases that are commonly used in the domain of protein-ligand interactions. Finally, we survey and analyze the machine learning (ML) approaches implemented to predict protein-ligand binding sites, ligand-binding affinity and binding pose (conformation) including both classical ML algorithms and recent deep learning methods. After exploring the correlation between these three aspects of protein-ligand interaction, it has been proposed that they should be studied in unison. We anticipate that our review will aid exploration and development of more accurate ML-based prediction strategies for studying protein-ligand interactions.

Assuntos

Antivirais , Tratamento Farmacológico da COVID-19 , COVID-19 , Aprendizado Profundo , Descoberta de Drogas , Mapas de Interação de Proteínas , SARS-CoV-2/metabolismo , Antivirais/química , Antivirais/farmacocinética , COVID-19/metabolismo , Humanos , Ligantes

Accurate cryo-EM protein particle picking by integrating the foundational AI image segmentation model and specialized U-Net.

Gyawali, Rajan; Dhakal, Ashwin; Wang, Liguo; Cheng, Jianlin.

bioRxiv ; 2024 Mar 14.

Artigo em Inglês | MEDLINE | ID: mdl-37873264

RESUMO

Picking protein particles in cryo-electron microscopy (cryo-EM) micrographs is a crucial step in the cryo-EM-based structure determination. However, existing methods trained on a limited amount of cryo-EM data still cannot accurately pick protein particles from noisy cryo-EM images. The general foundational artificial intelligence (AI)-based image segmentation model such as Meta's Segment Anything Model (SAM) cannot segment protein particles well because their training data do not include cryo-EM images. Here, we present a novel approach (CryoSegNet) of integrating an attention-gated U-shape network (U-Net) specially designed and trained for cryo-EM particle picking and the SAM. The U-Net is first trained on a large cryo-EM image dataset and then used to generate input from original cryo-EM images for SAM to make particle pickings. CryoSegNet shows both high precision and recall in segmenting protein particles from cryo-EM micrographs, irrespective of protein type, shape, and size. On several independent datasets of various protein types, CryoSegNet outperforms two top machine learning particle pickers crYOLO and Topaz as well as SAM itself. The average resolution of density maps reconstructed from the particles picked by CryoSegNet is 3.32 Å, 7% better than 3.57 Å of Topaz and 14% better than 3.85 Å of crYOLO.

CryoPPP: A Large Expert-Labelled Cryo-EM Image Dataset for Machine Learning Protein Particle Picking.

Dhakal, Ashwin; Gyawali, Rajan; Wang, Liguo; Cheng, Jianlin.

bioRxiv ; 2023 Feb 22.

Artigo em Inglês | MEDLINE | ID: mdl-36865277

RESUMO

Cryo-electron microscopy (cryo-EM) is currently the most powerful technique for determining the structures of large protein complexes and assemblies. Picking single-protein particles from cryo-EM micrographs (images) is a key step in reconstructing protein structures. However, the widely used template-based particle picking process is labor-intensive and time-consuming. Though the emerging machine learning-based particle picking can potentially automate the process, its development is severely hindered by lack of large, high-quality, manually labelled training data. Here, we present CryoPPP, a large, diverse, expert-curated cryo-EM image dataset for single protein particle picking and analysis to address this bottleneck. It consists of manually labelled cryo-EM micrographs of 32 non-redundant, representative protein datasets selected from the Electron Microscopy Public Image Archive (EMPIAR). It includes 9,089 diverse, high-resolution micrographs (âË»300 cryo-EM images per EMPIAR dataset) in which the coordinates of protein particles were labelled by human experts. The protein particle labelling process was rigorously validated by both 2D particle class validation and 3D density map validation with the gold standard. The dataset is expected to greatly facilitate the development of machine learning and artificial intelligence methods for automated cryo-EM protein particle picking. The dataset and data processing scripts are available at https://github.com/BioinfoMachineLearning/cryoppp.

A large expert-curated cryo-EM image dataset for machine learning protein particle picking.

Dhakal, Ashwin; Gyawali, Rajan; Wang, Liguo; Cheng, Jianlin.

Sci Data ; 10(1): 392, 2023 06 22.

Artigo em Inglês | MEDLINE | ID: mdl-37349345

RESUMO

Cryo-electron microscopy (cryo-EM) is a powerful technique for determining the structures of biological macromolecular complexes. Picking single-protein particles from cryo-EM micrographs is a crucial step in reconstructing protein structures. However, the widely used template-based particle picking process is labor-intensive and time-consuming. Though machine learning and artificial intelligence (AI) based particle picking can potentially automate the process, its development is hindered by lack of large, high-quality labelled training data. To address this bottleneck, we present CryoPPP, a large, diverse, expert-curated cryo-EM image dataset for protein particle picking and analysis. It consists of labelled cryo-EM micrographs (images) of 34 representative protein datasets selected from the Electron Microscopy Public Image Archive (EMPIAR). The dataset is 2.6 terabytes and includes 9,893 high-resolution micrographs with labelled protein particle coordinates. The labelling process was rigorously validated through 2D particle class validation and 3D density map validation with the gold standard. The dataset is expected to greatly facilitate the development of both AI and classical methods for automated cryo-EM protein particle picking.

Assuntos

Aprendizado de Máquina , Proteínas , Inteligência Artificial , Microscopia Crioeletrônica , Humanos , Animais

CryoTransformer: A Transformer Model for Picking Protein Particles from Cryo-EM Micrographs.

Dhakal, Ashwin; Gyawali, Rajan; Wang, Liguo; Cheng, Jianlin.

bioRxiv ; 2023 Oct 23.

Artigo em Inglês | MEDLINE | ID: mdl-37961171

RESUMO

Cryo-electron microscopy (cryo-EM) is a powerful technique for determining the structures of large protein complexes. Picking single protein particles from cryo-EM micrographs (images) is a crucial step in reconstructing protein structures from them. However, the widely used template-based particle picking process requires some manual particle picking and is labor-intensive and time-consuming. Though machine learning and artificial intelligence (AI) can potentially automate particle picking, the current AI methods pick particles with low precision or low recall. The erroneously picked particles can severely reduce the quality of reconstructed protein structures, especially for the micrographs with low signal-to-noise (SNR) ratios. To address these shortcomings, we devised CryoTransformer based on transformers, residual networks, and image processing techniques to accurately pick protein particles from cryo-EM micrographs. CryoTransformer was trained and tested on the largest labelled cryo-EM protein particle dataset - CryoPPP. It outperforms the current state-of-the-art machine learning methods of particle picking in terms of the resolution of 3D density maps reconstructed from the picked particles as well as F1-score and is poised to facilitate the automation of the cryo-EM protein particle picking.

CryoVirusDB: A Labeled Cryo-EM Image Dataset for AI-Driven Virus Particle Picking.

Gyawali, Rajan; Dhakal, Ashwin; Wang, Liguo; Cheng, Jianlin.

bioRxiv ; 2023 Dec 26.

Artigo em Inglês | MEDLINE | ID: mdl-38234823

RESUMO

With the advancements in instrumentation, image processing algorithms, and computational capabilities, single-particle electron cryo-microscopy (cryo-EM) has achieved nearly atomic resolution in determining the 3D structures of viruses. The virus structures play a crucial role in studying their biological function and advancing the development of antiviral vaccines and treatments. Despite the effectiveness of artificial intelligence (AI) in general image processing, its development for identifying and extracting virus particles from cryo-EM micrographs (images) has been hindered by the lack of manually labelled high-quality datasets. To fill the gap, we introduce CryoVirusDB, a labeled dataset containing the coordinates of expert-picked virus particles in cryo-EM micrographs. CryoVirusDB comprises 9,941 micrographs of 9 different viruses along with the coordinates of 339,398 labeled virus particles. It can be used to train and test AI and machine learning (e.g., deep learning) methods to accurately identify virus particles in cryo-EM micrographs for building atomic 3D structural models for viruses.

Deep Learning Prediction of Severe Health Risks for Pediatric COVID-19 Patients with a Large Feature Set in 2021 BARDA Data Challenge.

Mahmud, Sajid; Soltanikazemi, Elham; Boadu, Frimpong; Dhakal, Ashwin; Cheng, Jianlin.

ArXiv ; 2022 Jun 03.

Artigo em Inglês | MEDLINE | ID: mdl-35677422

RESUMO

Most children infected with COVID-19 have no or mild symptoms and can recover automatically by themselves, but some pediatric COVID-19 patients need to be hospitalized or even to receive intensive medical care (e.g., invasive mechanical ventilation or cardiovascular support) to recover from the illnesses. Therefore, it is critical to predict the severe health risk that COVID-19 infection poses to children to provide precise and timely medical care for vulnerable pediatric COVID-19 patients. However, predicting the severe health risk for COVID-19 patients including children remains a significant challenge because many underlying medical factors affecting the risk are still largely unknown. In this work, instead of searching for a small number of most useful features to make prediction, we design a novel large-scale bag-of-words like method to represent various medical conditions and measurements of COVID-19 patients. After some simple feature filtering based on logistical regression, the large set of features is used with a deep learning method to predict both the hospitalization risk for COVID-19 infected children and the severe complication risk for the hospitalized pediatric COVID-19 patients. The method was trained and tested the datasets of the Biomedical Advanced Research and Development Authority (BARDA) Pediatric COVID-19 Data Challenge held from Sept. 15 to Dec. 17, 2021. The results show that the approach can rather accurately predict the risk of hospitalization and severe complication for pediatric COVID-19 patients and deep learning is more accurate than other machine learning methods.

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

Detalhe da pesquisa