Search | Portal Regional da BVS

Best practices to evaluate the impact of biomedical research software: metric collection beyond citations.

Afiaz, Awan; Ivanov, Andrey A; Chamberlin, John; Hanauer, David; Savonen, Candace L; Goldman, Mary J; Morgan, Martin; Reich, Michael; Getka, Alexander; Holmes, Aaron; Pati, Sarthak; Knight, Dan; Boutros, Paul C; Bakas, Spyridon; Caporaso, J Gregory; Del Fiol, Guilherme; Hochheiser, Harry; Haas, Brian; Schloss, Patrick D; Eddy, James A; Albrecht, Jake; Fedorov, Andrey; Waldron, Levi; Hoffman, Ava M; Bradshaw, Richard L; Leek, Jeffrey T; Wright, Carrie.

Bioinformatics; 40(8), 2024 Aug 02.

Article in English | MEDLINE | ID: mdl-39067017

ABSTRACT

MOTIVATION: Software is vital for the advancement of biology and medicine. Impact evaluations of scientific software have primarily emphasized traditional citation metrics of associated papers, even though these metrics inadequately capture the dynamic picture of impact and are undermined by improper citation. RESULTS: To understand how software developers evaluate their tools, we conducted a survey of participants in the Informatics Technology for Cancer Research (ITCR) program funded by the National Cancer Institute (NCI). We found that although developers recognize the value of more extensive metric collection, they are hindered by a lack of funding and time. We also investigated how often software in this community implemented infrastructure that supports nontraditional metrics, and how this related to the rate of papers describing usage of the software. We found that infrastructure such as social media presence, more in-depth documentation, the presence of software health metrics, and clear information on how to contact developers seemed to be associated with increased mention rates. Analyzing more diverse metrics can enable developers to better understand user engagement, justify continued funding, identify novel use cases, pinpoint improvement areas, and ultimately amplify their software's impact. This approach also carries challenges, including distorted or misleading metrics, as well as ethical and security concerns. More attention is needed to the nuances involved in capturing impact across the spectrum of biomedical software. For funders and developers, we outline guidance based on experience from our community. By considering how we evaluate software, we can empower developers to create tools that more effectively accelerate biological and medical research progress. AVAILABILITY AND IMPLEMENTATION: More information about the analysis, as well as access to data and code, is available at https://github.com/fhdsl/ITCR_Metrics_manuscript_website.

Subjects

Biomedical Research; Software; Biomedical Research/methods; Humans; United States; Computational Biology/methods

Evaluation and optimization of sequence-based gene regulatory deep learning models.

Rafi, Abdul Muntakim; Nogina, Daria; Penzar, Dmitry; Lee, Dohoon; Lee, Danyeong; Kim, Nayeon; Kim, Sangyeup; Kim, Dohyeon; Shin, Yeojin; Kwak, Il-Youp; Meshcheryakov, Georgy; Lando, Andrey; Zinkevich, Arsenii; Kim, Byeong-Chan; Lee, Juhyun; Kang, Taein; Vaishnav, Eeshit Dhaval; Yadollahpour, Payman; Kim, Sun; Albrecht, Jake; Regev, Aviv; Gong, Wuming; Kulakovskiy, Ivan V; Meyer, Pablo; de Boer, Carl.

bioRxiv; 2024 Feb 17.

Article in English | MEDLINE | ID: mdl-38405704

ABSTRACT

Neural networks have emerged as immensely powerful tools in predicting functional genomic regions, notably evidenced by recent successes in deciphering gene regulatory logic. However, a systematic evaluation of how model architectures and training strategies impact genomics model performance is lacking. To address this gap, we held a DREAM Challenge where competitors trained models on a dataset of millions of random promoter DNA sequences and corresponding expression levels, experimentally determined in yeast, to best capture the relationship between regulatory DNA and gene expression. For a robust evaluation of the models, we designed a comprehensive suite of benchmarks encompassing various sequence types. While some benchmarks produced similar results across the top-performing models, others differed substantially. All top-performing models used neural networks, but diverged in architectures and novel training strategies, tailored to genomics sequence data. To dissect how architectural and training choices impact performance, we developed the Prix Fixe framework to divide any given model into logically equivalent building blocks. We tested all possible combinations for the top three models and observed performance improvements for each. The DREAM Challenge models not only achieved state-of-the-art results on our comprehensive yeast dataset but also consistently surpassed existing benchmarks on Drosophila and human genomic datasets. Overall, we demonstrate that high-quality gold-standard genomics datasets can drive significant progress in model development.
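The idea of splitting models into interchangeable building blocks and testing every combination can be illustrated with a minimal sketch. This is not the Prix Fixe implementation: the stage names (`first_layers`, `core`, `final_layers`) and the team labels are hypothetical placeholders, and the sketch only enumerates candidate configurations without training anything.

```python
from itertools import product

# Hypothetical building blocks: for each stage of a model, one variant
# contributed by each of three top-performing teams. The stage names and
# team labels are illustrative assumptions, not taken from the abstract.
BLOCKS = {
    "first_layers": ["team_a", "team_b", "team_c"],
    "core": ["team_a", "team_b", "team_c"],
    "final_layers": ["team_a", "team_b", "team_c"],
}


def enumerate_combinations(blocks):
    """Return every way of picking one variant per stage.

    Each combination is a dict mapping stage name -> chosen variant.
    """
    stages = list(blocks)
    return [
        dict(zip(stages, choice))
        for choice in product(*(blocks[stage] for stage in stages))
    ]


combinations = enumerate_combinations(BLOCKS)
```

Each resulting dictionary describes one candidate model that would then be trained and scored on the same benchmarks, which is what makes the contribution of an individual block comparable across models.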

Microbiome preterm birth DREAM challenge: Crowdsourcing machine learning approaches to advance preterm birth research.

Golob, Jonathan L; Oskotsky, Tomiko T; Tang, Alice S; Roldan, Alennie; Chung, Verena; Ha, Connie W Y; Wong, Ronald J; Flynn, Kaitlin J; Parraga-Leo, Antonio; Wibrand, Camilla; Minot, Samuel S; Oskotsky, Boris; Andreoletti, Gaia; Kosti, Idit; Bletz, Julie; Nelson, Amber; Gao, Jifan; Wei, Zhoujingpeng; Chen, Guanhua; Tang, Zheng-Zheng; Novielli, Pierfrancesco; Romano, Donato; Pantaleo, Ester; Amoroso, Nicola; Monaco, Alfonso; Vacca, Mirco; De Angelis, Maria; Bellotti, Roberto; Tangaro, Sabina; Kuntzleman, Abigail; Bigcraft, Isaac; Techtmann, Stephen; Bae, Daehun; Kim, Eunyoung; Jeon, Jongbum; Joe, Soobok; Theis, Kevin R; Ng, Sherrianne; Lee, Yun S; Diaz-Gimeno, Patricia; Bennett, Phillip R; MacIntyre, David A; Stolovitzky, Gustavo; Lynch, Susan V; Albrecht, Jake; Gomez-Lopez, Nardhy; Romero, Roberto; Stevenson, David K; Aghaeepour, Nima; Tarca, Adi L.

Cell Rep Med; 5(1): 101350, 2024 Jan 16.

Article in English | MEDLINE | ID: mdl-38134931

ABSTRACT

Every year, 11% of infants are born preterm, with significant health consequences, and the vaginal microbiome is a risk factor for preterm birth. We crowdsource models to predict (1) preterm birth (PTB; <37 weeks) or (2) early preterm birth (ePTB; <32 weeks) from 9 vaginal microbiome studies representing 3,578 samples from 1,268 pregnant individuals, aggregated from public raw data via phylogenetic harmonization. The predictive models are validated on two independent unpublished datasets representing 331 samples from 148 pregnant individuals. The top-performing models (among 148 and 121 submissions from 318 teams) achieve area under the receiver operating characteristic curve (AUROC) scores of 0.69 and 0.87 for predicting PTB and ePTB, respectively. Alpha diversity, VALENCIA community state types, and composition are important features in the top-performing models, most of which are tree-based methods. This work is a model for translating microbiome data into clinically relevant predictive models and for better understanding preterm birth.
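For readers unfamiliar with the AUROC metric reported above, the following is a minimal sketch of how it can be computed from binary labels and model scores. It uses the pairwise (Mann-Whitney) formulation; the function name `auroc` and the example labels and scores are illustrative assumptions and are not data from the challenge.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample,
    with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = preterm, 0 = term; scores are predicted risks.
example_labels = [1, 0, 1, 0, 0, 1]
example_scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8]
example_auroc = auroc(example_labels, example_scores)
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which gives context for the reported scores of 0.69 (PTB) and 0.87 (ePTB).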

Subjects

Crowdsourcing; Microbiota; Premature Birth; Pregnancy; Female; Infant, Newborn; Humans; Phylogeny; Vagina; Microbiota/genetics
