Machine Learning Strategies for Improved Phenotype Prediction in Underrepresented Populations.
Bonet, David; Levin, May; Montserrat, Daniel Mas; Ioannidis, Alexander G.
Affiliation
  • Bonet D; Stanford University, Stanford, CA, US.
  • Levin M; Universitat Politècnica de Catalunya, Barcelona, Spain.
  • Montserrat DM; Stanford University, Stanford, CA, US.
  • Ioannidis AG; Stanford University, Stanford, CA, US.
bioRxiv; 2023 Oct 17.
Article in En | MEDLINE | ID: mdl-37904983
ABSTRACT
Precision medicine models often perform better for populations of European ancestry due to the over-representation of this group in the genomic datasets and large-scale biobanks from which the models are constructed. As a result, prediction models may misrepresent or provide less accurate treatment recommendations for underrepresented populations, contributing to health disparities. This study introduces an adaptable machine learning toolkit that integrates multiple existing methodologies and novel techniques to enhance the prediction accuracy for underrepresented populations in genomic datasets. By leveraging machine learning techniques, including gradient boosting and automated methods, coupled with novel population-conditional re-sampling techniques, our method significantly improves the phenotypic prediction from single nucleotide polymorphism (SNP) data for diverse populations. We evaluate our approach using the UK Biobank, which is composed primarily of British individuals with European ancestry, and a minority representation of groups with Asian and African ancestry. Performance metrics demonstrate substantial improvements in phenotype prediction for underrepresented groups, achieving prediction accuracy comparable to that of the majority group. This approach represents a significant step towards improving prediction accuracy amidst current dataset diversity challenges. By integrating a tailored pipeline, our approach fosters more equitable validity and utility of statistical genetics methods, paving the way for more inclusive models and outcomes.
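The abstract does not spell out how the population-conditional re-sampling works. As a rough illustration only, the sketch below shows one common way to re-sample conditionally on a population label: oversampling smaller groups with replacement until every group matches the size of the largest. The function name `resample_by_population`, the ancestry labels, and the group sizes are all hypothetical, and this should not be read as the authors' actual procedure.

```python
import random
from collections import defaultdict


def resample_by_population(labels, seed=0):
    """Return row indices in which every population label appears equally often.

    Assumption: groups smaller than the largest one are oversampled with
    replacement. This is an illustrative scheme, not the paper's method.
    """
    rng = random.Random(seed)

    # Collect the row indices that belong to each population label.
    by_group = defaultdict(list)
    for idx, label in enumerate(labels):
        by_group[label].append(idx)

    # Every group is topped up to the size of the largest group.
    target = max(len(rows) for rows in by_group.values())

    balanced = []
    for label in sorted(by_group):
        rows = by_group[label]
        balanced.extend(rows)  # keep each original sample once
        balanced.extend(rng.choices(rows, k=target - len(rows)))  # draw extras with replacement

    rng.shuffle(balanced)
    return balanced


# Hypothetical toy labels: 6 majority samples, 2 and 1 minority samples.
populations = ["EUR"] * 6 + ["AFR"] * 2 + ["ASN"] * 1
indices = resample_by_population(populations)
counts = {p: sum(populations[i] == p for i in indices) for p in sorted(set(populations))}
print(counts)
```

The returned indices could then be used to select rows of a SNP matrix and phenotype vector before fitting a model such as a gradient-boosted learner, so that the training loss is not dominated by the majority group. Oversampling duplicates minority samples rather than adding new information, so any evaluation would still need to be done on held-out, non-resampled data.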

Full text: 1 Collections: 01-internacional Health context: 1_ASSA2030 Database: MEDLINE Language: En Journal: BioRxiv Year of publication: 2023 Document type: Article
