ABSTRACT
STUDY QUESTION: Can an artificial intelligence (AI) model predict human embryo ploidy status using static images captured by optical light microscopy? SUMMARY ANSWER: Results demonstrated predictive accuracy for embryo euploidy and showed a significant correlation between AI score and euploidy rate, based on assessment of images of blastocysts at Day 5 after IVF. WHAT IS KNOWN ALREADY: Euploid embryos displaying the normal human chromosomal complement of 46 chromosomes are preferentially selected for transfer over aneuploid embryos (abnormal complement), as they are associated with improved clinical outcomes. Currently, evaluation of embryo genetic status is most commonly performed by preimplantation genetic testing for aneuploidy (PGT-A), which involves embryo biopsy and genetic testing. The potential for embryo damage during biopsy, and the non-uniform nature of aneuploid cells in mosaic embryos, has prompted investigation of additional, non-invasive, whole embryo methods for evaluation of embryo genetic status. STUDY DESIGN, SIZE, DURATION: A total of 15 192 blastocyst-stage embryo images with associated clinical outcomes were provided by 10 different IVF clinics in the USA, India, Spain and Malaysia. The majority of data were retrospective, with two additional prospectively collected blind datasets provided by IVF clinics using the genetics AI model in clinical practice. Of these images, a total of 5050 images of embryos on Day 5 of in vitro culture were used for the development of the AI model. These Day 5 images were provided for 2438 consecutively treated women who had undergone IVF procedures in the USA between 2011 and 2020. The remaining images were used for evaluation of performance in different settings, or otherwise excluded for not matching the inclusion criteria. 
PARTICIPANTS/MATERIALS, SETTING, METHODS: The genetics AI model was trained using static 2-dimensional optical light microscope images of Day 5 blastocysts with linked genetic metadata obtained from PGT-A. The endpoint was ploidy status (euploid or aneuploid) based on PGT-A results. Predictive accuracy was determined by evaluating sensitivity (correct prediction of euploid), specificity (correct prediction of aneuploid) and overall accuracy. The Matthews correlation coefficient, receiver-operating characteristic curves and precision-recall curves (including AUC values) were also determined. Performance was further assessed using correlation analyses and simulated cohort studies to evaluate ranking ability for euploid enrichment. MAIN RESULTS AND THE ROLE OF CHANCE: Overall accuracy for the prediction of euploidy on a blind test dataset was 65.3%, with a sensitivity of 74.6%. When the blind test dataset was cleansed of poor quality and mislabeled images, overall accuracy increased to 77.4%. This performance may be relevant to clinical situations where confounding factors, such as variability in PGT-A testing, have been accounted for. There was a significant positive correlation between AI score and the proportion of euploid embryos, with very high scoring embryos (9.0-10.0) twice as likely to be euploid as the lowest-scoring embryos (0.0-2.4). When using the genetics AI model to rank embryos in a cohort, the probability of the top-ranked embryo being euploid was 82.4%, which was 26.4% more effective than using random ranking, and ~13-19% more effective than using the Gardner score. The probability increased to 97.0% when considering the likelihood of one of the top two ranked embryos being euploid, and the probability of both top two ranked embryos being euploid was 66.4%. 
Additional analyses showed that the AI model generalized well to different patient demographics and could also be used for the evaluation of Day 6 embryos and for images taken using multiple time-lapse systems. Results suggested that the AI model could potentially be used to differentiate mosaic embryos based on the level of mosaicism. LIMITATIONS, REASONS FOR CAUTION: While the current investigation was performed using both retrospectively and prospectively collected data, it will be important to continue to evaluate real-world use of the genetics AI model. The endpoint described was euploidy based on the clinical outcome of PGT-A results only, so predictive accuracy for genetic status in utero or at birth was not evaluated. Rebiopsy studies of embryos using a range of PGT-A methods indicated a degree of variability in PGT-A results, which must be considered when interpreting the performance of the AI model. WIDER IMPLICATIONS OF THE FINDINGS: These findings collectively support the use of this genetics AI model for the evaluation of embryo ploidy status in a clinical setting. Results can be used to aid in prioritizing and enriching for embryos that are likely to be euploid for multiple clinical purposes, including selection for transfer in the absence of alternative genetic testing methods, selection for cryopreservation for future use or selection for further confirmatory PGT-A testing, as required. STUDY FUNDING/COMPETING INTEREST(S): Life Whisperer Diagnostics is a wholly owned subsidiary of the parent company, Presagen Holdings Pty Ltd. Funding for the study was provided by Presagen with grant funding received from the South Australian Government: Research, Commercialisation, and Startup Fund (RCSF). 'In kind' support and embryology expertise to guide algorithm development were provided by Ovation Fertility. 'In kind' support in terms of computational resources was provided through the Amazon Web Services (AWS) Activate Program. J.M.M.H., D.P. and M.P. are co-owners of Life Whisperer and Presagen. S.M.D., M.A.D. and T.V.N. are employees or former employees of Life Whisperer. S.M.D., J.M.M.H., M.A.D., T.V.N., D.P. and M.P. are listed as inventors of patents relating to this work, and also have stock options in the parent company Presagen. M.V. sits on the advisory board for the global distributor of the technology described in this study and also received support for attending meetings. TRIAL REGISTRATION NUMBER: N/A.
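The accuracy measures named in the methods (sensitivity, specificity, overall accuracy and the Matthews correlation coefficient) can all be derived from a 2x2 confusion matrix. The following is a minimal sketch, treating "euploid" as the positive class as the abstract does; the function name `binary_metrics` and the counts in the usage lines are hypothetical and are not taken from the study.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Compute summary metrics from confusion-matrix counts.

    tp: euploid embryos predicted euploid
    fn: euploid embryos predicted aneuploid
    tn: aneuploid embryos predicted aneuploid
    fp: aneuploid embryos predicted euploid
    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }

# Hypothetical counts for illustration only (not data from the study).
m = binary_metrics(tp=75, fn=25, tn=55, fp=45)
print(m)
```

Unlike overall accuracy, the Matthews correlation coefficient uses all four cells of the matrix, so it stays informative when euploid and aneuploid embryos are not equally represented.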
Subjects
Preimplantation Diagnosis; Aneuploidy; Artificial Intelligence; Australia; Blastocyst/pathology; Female; Fertilization in Vitro/methods; Humans; Pregnancy; Preimplantation Diagnosis/methods; Probability; Retrospective Studies
ABSTRACT
Recently publicized cryogenic storage tank failures have created nationwide concern among infertility patients and patients storing embryos and gametes for future use. To assure patient confidence, quality management (QM) plans applied by in vitro fertilization (IVF) laboratories need to include a more comprehensive focus on the cryostorage of reproductive specimens. The purpose of this review is to provide best practice guidelines for the cryogenic storage of sperm, oocytes, embryos, and other reproductive tissues (e.g., testicular and ovarian tissue, cord blood cells, and stem cells) and to recommend a strategy of thorough and appropriate quality and risk management procedures aimed at alleviating or minimizing the consequences of catastrophic events.
Subjects
Cryopreservation/methods; Practice Guidelines as Topic/standards; Quality Assurance, Health Care/standards; Reproductive Techniques, Assisted/standards; Tissue Banks/standards; Humans
ABSTRACT
Medical datasets inherently contain errors from subjective or inaccurate test results, or from confounding biological complexities. It is difficult for medical experts to detect these elusive errors manually, due to lack of contextual information, limiting data privacy regulations, and the sheer scale of data to be reviewed. Current methods for training robust artificial intelligence (AI) models on data containing mislabeled examples generally fall into one of several categories: attempting to improve the robustness of the model architecture, the regularization techniques used, or the loss function used during training, or selecting a subset of data that contains cleaner labels. This last category requires the ability to efficiently detect errors either prior to or during training, and then either relabel them or remove them completely. More recent progress in error detection has focused on using multi-network learning to minimize the deleterious effects of errors on training; however, using many neural networks to reach a consensus on which data should be removed can be computationally intensive and inefficient. In this work, a deep-learning based algorithm was used in conjunction with a label-clustering approach to automate error detection. For datasets with synthetic label flips added, these errors were identified with an accuracy of up to 85%, while requiring up to 93% less computing resources than a previously developed model consensus approach. The resulting trained AI models exhibited greater training stability and up to a 45% improvement in accuracy, from 69% to over 99%, compared with the consensus approach, at least a 10% improvement over noise-robust loss functions in a binary classification problem, and a 51% improvement for multi-class classification. These results indicate that practical, automated a priori detection of errors in medical data is possible, without human oversight.
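The abstract does not specify the label-clustering algorithm itself, so the following is only a generic illustration of the evaluation idea it describes: flag suspected label errors automatically, then score the flags against known synthetic label flips. The sketch swaps in a simple, widely used heuristic (splitting per-sample training losses into a low and a high cluster); it is not the authors' method, and all function names and values are hypothetical.

```python
def two_means_threshold(values, iters=50):
    """Return a threshold separating 1-D values into low and high clusters."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        threshold = (lo + hi) / 2
        low = [v for v in values if v <= threshold]
        high = [v for v in values if v > threshold]
        if not high:  # all values identical: nothing to separate
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):  # cluster centres have converged
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2

def flag_suspects(losses):
    """Flag samples in the high-loss cluster as suspected label errors."""
    threshold = two_means_threshold(losses)
    return [loss > threshold for loss in losses]

def detection_accuracy(flags, truly_flipped):
    """Fraction of samples whose flag matches the known synthetic flip status."""
    correct = sum(f == t for f, t in zip(flags, truly_flipped))
    return correct / len(flags)

# Hypothetical per-sample losses and known synthetic flips (illustration only).
losses = [0.05, 0.10, 0.08, 2.30, 0.12, 1.90, 0.07, 0.60]
truly_flipped = [False, False, False, True, False, True, False, False]
flags = flag_suspects(losses)
print(detection_accuracy(flags, truly_flipped))
```

Scoring against synthetic flips is possible only because the flipped samples are known in advance; on real medical data the true error set is unknown, which is why the abstract reports detection accuracy on datasets with flips added deliberately.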
Subjects
Artificial Intelligence; Deep Learning; Humans; Algorithms; Cluster Analysis; Consensus
ABSTRACT
Training on multiple diverse data sources is critical to ensure unbiased and generalizable AI. In healthcare, data privacy laws prohibit data from being moved outside the country of origin, preventing global medical datasets from being centralized for AI training. Data-centric, cross-silo federated learning represents a pathway forward for training on distributed medical datasets. Existing approaches typically require updates to a training model to be transferred to a central server, potentially breaching data privacy laws unless the updates are sufficiently disguised or abstracted to prevent reconstruction of the dataset. Here we present a completely decentralized federated learning approach, using knowledge distillation, ensuring data privacy and protection. Each node operates independently without needing to access external data. AI accuracy using this approach is found to be comparable to centralized training, and when nodes contain poor-quality data, which is common in healthcare, AI accuracy can exceed the performance of traditional centralized training.
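For readers unfamiliar with knowledge distillation, the following is a minimal sketch of the standard temperature-softened distillation loss, in which a student model is trained to match a teacher model's output distribution rather than the raw data. How the authors apply distillation between nodes is not described in the abstract; the function names, the temperature value and the logits below are assumptions for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's soft predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical logits for a three-class problem (illustration only).
teacher = [2.0, 0.5, -1.0]
matching_student = [2.0, 0.5, -1.0]
differing_student = [0.0, 1.5, 0.2]
print(distillation_loss(matching_student, teacher))
print(distillation_loss(differing_student, teacher))
```

Because the loss depends only on model outputs, a model can in principle be handed from node to node and trained against locally computed predictions, without any patient data leaving its node; this is the property that makes distillation attractive for privacy-constrained settings.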