Search | BVS Violência e Saúde

MELLODDY: Cross-pharma Federated Learning at Unprecedented Scale Unlocks Benefits in QSAR without Compromising Proprietary Information.

Heyndrickx, Wouter; Mervin, Lewis; Morawietz, Tobias; Sturm, Noé; Friedrich, Lukas; Zalewski, Adam; Pentina, Anastasia; Humbeck, Lina; Oldenhof, Martijn; Niwayama, Ritsuya; Schmidtke, Peter; Fechner, Nikolas; Simm, Jaak; Arany, Adam; Drizard, Nicolas; Jabal, Rama; Afanasyeva, Arina; Loeb, Regis; Verma, Shlok; Harnqvist, Simon; Holmes, Matthew; Pejo, Balazs; Telenczuk, Maria; Holway, Nicholas; Dieckmann, Arne; Rieke, Nicola; Zumsande, Friederike; Clevert, Djork-Arné; Krug, Michael; Luscombe, Christopher; Green, Darren; Ertl, Peter; Antal, Peter; Marcus, David; Do Huu, Nicolas; Fuji, Hideyoshi; Pickett, Stephen; Acs, Gergely; Boniface, Eric; Beck, Bernd; Sun, Yax; Gohier, Arnaud; Rippmann, Friedrich; Engkvist, Ola; Göller, Andreas H; Moreau, Yves; Galtier, Mathieu N; Schuffenhauer, Ansgar; Ceulemans, Hugo.

J Chem Inf Model ; 64(7): 2331-2344, 2024 Apr 08.

Article in English | MEDLINE | ID: mdl-37642660

ABSTRACT

Federated multipartner machine learning has been touted as an appealing and efficient method to increase the effective training data volume and thereby the predictivity of models, particularly when the generation of training data is resource-intensive. In the landmark MELLODDY project, indeed, each of ten pharmaceutical companies realized aggregated improvements on its own classification or regression models through federated learning. To this end, they leveraged a novel implementation extending multitask learning across partners, on a platform audited for privacy and security. The experiments involved an unprecedented cross-pharma data set of 2.6+ billion confidential experimental activity data points, documenting 21+ million physical small molecules and 40+ thousand assays in on-target and secondary pharmacodynamics and pharmacokinetics. Appropriate complementary metrics were developed to evaluate the predictive performance in the federated setting. In addition to predictive performance increases in labeled space, the results point toward an extended applicability domain in federated learning. Increases in collective training data volume, including by means of auxiliary data resulting from single concentration high-throughput and imaging assays, continued to boost predictive performance, albeit with a saturating return. Markedly higher improvements were observed for the pharmacokinetics and safety panel assay-based task subsets.

Subjects

Benchmarking, Quantitative Structure-Activity Relationship, Biological Assay, Machine Learning

ChemGrapher: Optical Graph Recognition of Chemical Compounds by Deep Learning.

Oldenhof, Martijn; Arany, Adam; Moreau, Yves; Simm, Jaak.

J Chem Inf Model ; 60(10): 4506-4517, 2020 Oct 26.

Article in English | MEDLINE | ID: mdl-32924466

ABSTRACT

In drug discovery, knowledge of the graph structure of chemical compounds is essential. Many thousands of scientific articles and patents in chemistry and pharmaceutical sciences have investigated chemical compounds, but in many cases, the details of the structure of these chemical compounds are published only as an image. A tool to analyze these images automatically and convert them into a chemical graph structure would be useful for many applications, such as drug discovery. A few such tools are available and they are mostly derived from optical character recognition. However, our evaluation of the performance of these tools reveals that they often make mistakes in recognizing the correct bond multiplicity and stereochemical information. In addition, errors sometimes even lead to missing atoms in the resulting graph. In our work, we address these issues by developing a compound recognition method based on machine learning. More specifically, we develop a deep neural network model for optical compound recognition. The deep learning solution presented here consists of a segmentation model, followed by three classification models that predict atom locations, bonds, and charges. Furthermore, this model not only predicts the graph structure of the molecule but also provides all information necessary to relate each component of the resulting graph to the source image. This solution is scalable and can rapidly process thousands of images. Finally, we empirically compare the proposed method with the well-established tool OSRA and observe significant error reduction.

Subjects

Deep Learning, Drug Discovery, Machine Learning, Neural Networks, Computer

Accessible Ecosystem for Clinical Research (Federated Learning for Everyone): Development and Usability Study.

Pirmani, Ashkan; Oldenhof, Martijn; Peeters, Liesbet M; De Brouwer, Edward; Moreau, Yves.

JMIR Form Res ; 8: e55496, 2024 Jul 17.

Article in English | MEDLINE | ID: mdl-39018557

ABSTRACT

BACKGROUND: The integrity and reliability of clinical research outcomes rely heavily on access to vast amounts of data. However, the fragmented distribution of these data across multiple institutions, along with ethical and regulatory barriers, presents significant challenges to accessing relevant data. While federated learning offers a promising solution to leverage insights from fragmented data sets, its adoption faces hurdles due to implementation complexities, scalability issues, and inclusivity challenges.

OBJECTIVE: This paper introduces Federated Learning for Everyone (FL4E), an accessible framework facilitating multistakeholder collaboration in clinical research. It focuses on simplifying federated learning through an innovative ecosystem-based approach.

METHODS: The "degree of federation" is a fundamental concept of FL4E, allowing for flexible integration of federated and centralized learning models. This feature provides a customizable solution by enabling users to choose the level of data decentralization based on specific health care settings or project needs, making federated learning more adaptable and efficient. By using an ecosystem-based collaborative learning strategy, FL4E encourages a comprehensive platform for managing real-world data, enhancing collaboration and knowledge sharing among its stakeholders.

RESULTS: Evaluating FL4E's effectiveness using real-world health care data sets has highlighted its ecosystem-oriented and inclusive design. By applying hybrid models to 2 distinct analytical tasks (classification and survival analysis) within real-world settings, we have effectively measured the "degree of federation" across various contexts. These evaluations show that FL4E's hybrid models not only match the performance of fully federated models but also avoid the substantial overhead usually linked with these models. Achieving this balance greatly enhances collaborative initiatives and broadens the scope of analytical possibilities within the ecosystem.

CONCLUSIONS: FL4E represents a significant step forward in collaborative clinical research by merging the benefits of centralized and federated learning. Its modular ecosystem-based design and the "degree of federation" feature make it an inclusive, customizable framework suitable for a wide array of clinical research scenarios, promising to revolutionize the field through improved collaboration and data use. Detailed implementation and analyses are available on the associated GitHub repository.
