Results 1 - 8 of 8
1.
Genome Res ; 33(10): 1734-1746, 2023 10.
Article in English | MEDLINE | ID: mdl-37879860

ABSTRACT

Although it is ubiquitous in genomics, the current human reference genome (GRCh38) is incomplete: It is missing large sections of heterochromatic sequence, and as a singular, linear reference genome, it does not represent the full spectrum of human genetic diversity. To characterize gaps in GRCh38 and human genetic diversity, we developed an algorithm for sequence location approximation using nuclear families (ASLAN) to identify the region of origin of reads that do not align to GRCh38. Using unmapped reads and variant calls from whole-genome sequences (WGSs), ASLAN uses a maximum likelihood model to identify the most likely region of the genome that a subsequence belongs to given the distribution of the subsequence in the unmapped reads and phasings of families. Validating ASLAN on synthetic data and on reads from the alternative haplotypes in the decoy genome, ASLAN localizes >90% of 100-bp sequences with >92% accuracy and ∼1 Mb of resolution. We then ran ASLAN on 100-mers from unmapped reads from WGS from more than 700 families, and compared ASLAN localizations to alignment of the 100-mers to the recently released T2T-CHM13 assembly. We found that many unmapped reads in GRCh38 originate from telomeres and centromeres that are gaps in GRCh38. ASLAN localizations are in high concordance with T2T-CHM13 alignments, except in the centromeres of the acrocentric chromosomes. Comparing ASLAN localizations and T2T-CHM13 alignments, we identified sequences missing from T2T-CHM13 or sequences with high divergence from their aligned region in T2T-CHM13, highlighting new hotspots for genetic diversity.
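
The maximum likelihood idea in this abstract can be illustrated with a toy sketch. It is not the ASLAN implementation: the data layout, the four-haplotype hypothesis set, the fixed `error` rate, and all function names are assumptions made for illustration. For each candidate region, the sketch asks how well the presence or absence of a subsequence in each child matches the inheritance of one parental haplotype in that region, then picks the region with the highest total log-likelihood.

```python
import math
from typing import Dict, List, Tuple

# (maternal haplotype inherited, paternal haplotype inherited), each 0 or 1
Phase = Tuple[int, int]


def family_log_likelihood(presence: List[bool], phases: List[Phase],
                          error: float = 0.05) -> float:
    """Best log-likelihood over the four parental haplotypes for one family.

    Assumption: a child carries the subsequence exactly when it inherited the
    haplotype that holds it, and each observation is wrong with probability `error`.
    """
    best = -math.inf
    for parent in (0, 1):          # 0 = mother, 1 = father
        for hap in (0, 1):         # which of that parent's two haplotypes
            ll = 0.0
            for has_seq, phase in zip(presence, phases):
                expected = phase[parent] == hap
                ll += math.log(1 - error if has_seq == expected else error)
            best = max(best, ll)
    return best


def localize(presence_by_family: Dict[str, List[bool]],
             phases_by_region: Dict[str, Dict[str, List[Phase]]],
             error: float = 0.05) -> str:
    """Return the candidate region with the highest summed log-likelihood."""
    def region_ll(region: str) -> float:
        return sum(
            family_log_likelihood(presence, phases_by_region[region][fam], error)
            for fam, presence in presence_by_family.items()
        )
    return max(phases_by_region, key=region_ll)


# Hypothetical data: two families, two candidate regions.
presence = {"fam1": [True, False, True], "fam2": [True, True]}
phases = {
    "region_A": {"fam1": [(0, 0), (1, 0), (0, 1)], "fam2": [(1, 1), (1, 0)]},
    "region_B": {"fam1": [(0, 0), (0, 0), (0, 0)], "fam2": [(0, 1), (1, 0)]},
}
best_region = localize(presence, phases)
```

In this toy data the presence pattern of both families matches maternal inheritance in `region_A` exactly, so that region is selected. A real model would also have to handle recombination, uncertain phasing, and subsequences shared by several haplotypes.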


Subject(s)
Human Genome, Genomics, Humans, Algorithms, Telomere/genetics, Genetic Variation, DNA Sequence Analysis
2.
Reproduction ; 168(6)2024 Dec 01.
Article in English | MEDLINE | ID: mdl-39269213

ABSTRACT

In brief: We describe a first-of-its-kind audit of LGBTQ+ inclusivity in fertility care providers across the United Kingdom. Despite efforts being made to improve LGBTQ+ inclusion in fertility care, our results paint a picture of widespread gaps in clinical and cultural expertise alongside significant barriers to LGBTQ+ inclusion. Abstract: LGBTQ+ patients comprise one of the fastest-growing user demographics in fertility care, yet they remain under-represented in fertility research, practice, and discourse. Existing studies have revealed significant systemic barriers, including cisheteronormativity, discrimination, and gaps in clinical expertise. In this article, we present a checklist of measures that clinics can take to improve LGBTQ+ inclusion in fertility care, co-created with members of the LGBTQ+ community. This checklist focuses on three key areas: cultural competence, clinical considerations, and online presence. The cultural competence criteria encompass inclusive communication practices, a broad understanding of LGBTQ+ healthcare needs, and knowledge of treatment options suitable for LGBTQ+ individuals. Clinical considerations include awareness of alternative examination and gamete collection techniques for transgender and gender diverse patients, the existence of specific clinical pathways for LGBTQ+ patients, and sensitivity to the psychological aspects of fertility care unique to this demographic. The online presence criteria evaluate provider websites for the use of inclusive language and the availability of LGBTQ+-relevant information. The checklist was used as the foundation for an audit of fertility care providers across the UK in early 2024. Our audit identified a widespread lack of LGBTQ+ inclusion, particularly for transgender and gender diverse patients, highlighting deficiencies in clinical knowledge and cultural competence. 
Our work calls attention to the need for further efforts to understand the barriers to inclusive and competent LGBTQ+ fertility care from both healthcare provider and patient perspectives.


Subject(s)
Reproductive Health, Sexual and Gender Minorities, Humans, United Kingdom, Female, Male, Health Personnel/psychology, Transgender Persons/psychology, Cultural Competency
3.
Reprod Biomed Online ; 48(3): 103654, 2024 03.
Article in English | MEDLINE | ID: mdl-38246064

ABSTRACT

RESEARCH QUESTION: What can three-dimensional cell contact networks tell us about the developmental potential of cleavage-stage human embryos? DESIGN: This pilot study was a retrospective analysis of two Embryoscope imaging datasets from two clinics. An artificial intelligence system was used to reconstruct the three-dimensional structure of embryos from 11-plane focal stacks. Networks of cell contacts were extracted from the resulting embryo three-dimensional models and each embryo's mean contacts per cell was computed. Unpaired t-tests and receiver operating characteristic curve analysis were used to statistically analyse mean cell contact outcomes. Cell contact networks from different embryos were compared to identify embryos with similar cell arrangements. RESULTS: At t4, a higher mean number of contacts per cell was associated with greater rates of blastulation and blastocyst quality. No associations were found with biochemical pregnancy, live birth, miscarriage or ploidy. At t8, a higher mean number of contacts was associated with increased blastocyst quality, biochemical pregnancy and live birth. No associations were found with miscarriage or aneuploidy. Mean contacts at t4 weakly correlated with those at t8. Four-cell embryos fell into nine distinct cell arrangements; the five most common accounted for 97% of embryos. Eight-cell embryos, however, displayed a greater degree of variation with 59 distinct cell arrangements. CONCLUSIONS: Evidence is provided for the clinical relevance of cleavage-stage cell arrangement in the human preimplantation embryo beyond the four-cell stage, which may improve selection techniques for day-3 transfers. This pilot study provides a strong case for further investigation into spatial biomarkers and three-dimensional morphokinetics.
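
The "mean contacts per cell" statistic can be sketched as the mean degree of an undirected contact network. The representation below (cells as integer indices, contacts as index pairs) and the function name are assumptions for illustration, not the study's pipeline.

```python
from typing import Iterable, Tuple


def mean_contacts_per_cell(n_cells: int, contacts: Iterable[Tuple[int, int]]) -> float:
    """Mean degree of an undirected contact network: 2 * contacts / cells."""
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    # Deduplicate (a, b) versus (b, a) and ignore self-contacts.
    unique = {frozenset(pair) for pair in contacts if pair[0] != pair[1]}
    return 2 * len(unique) / n_cells


# Hypothetical four-cell embryos: every cell touching every other cell
# (tetrahedral) versus a ring in which each cell touches two neighbours.
tetrahedral = mean_contacts_per_cell(4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])
ring = mean_contacts_per_cell(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
```

Each contact is counted once but contributes to two cells, hence the factor of two.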


Subject(s)
Spontaneous Abortion, Pregnancy, Female, Humans, Retrospective Studies, Embryo Transfer/methods, Artificial Intelligence, Pilot Projects, Ovum Cleavage Stage, Blastocyst, Aneuploidy, Biomarkers, Pregnancy Rate
4.
Hum Reprod ; 38(10): 1918-1926, 2023 10 03.
Article in English | MEDLINE | ID: mdl-37581894

ABSTRACT

STUDY QUESTION: Can machine learning predict the number of oocytes retrieved from controlled ovarian hyperstimulation (COH)? SUMMARY ANSWER: Three machine-learning models were successfully trained to predict the number of oocytes retrieved from COH. WHAT IS KNOWN ALREADY: A number of previous studies have identified and built predictive models on factors that influence the number of oocytes retrieved during COH. Many of these studies are, however, limited in the fact that they only consider a small number of variables in isolation. STUDY DESIGN, SIZE, DURATION: This study was a retrospective analysis of a dataset of 11,286 cycles performed at a single centre in France between 2009 and 2020 with the aim of building a predictive model for the number of oocytes retrieved from ovarian stimulation. The analysis was carried out by a data analysis team external to the centre using the Substra framework. The Substra framework enabled the data analysis team to send computer code to run securely on the centre's on-premises server. In this way, a high level of data security was achieved as the data analysis team did not have direct access to the data, nor did the data leave the centre at any point during the study. PARTICIPANTS/MATERIALS, SETTING, METHODS: The Light Gradient Boosting Machine algorithm was used to produce three predictive models: one that directly predicted the number of oocytes retrieved and two that predicted which of a set of bins provided by two clinicians the number of oocytes retrieved fell into. The resulting models were evaluated on a held-out test set and compared to linear and logistic regression baselines. In addition, the models themselves were analysed to identify the parameters that had the biggest impact on their predictions. MAIN RESULTS AND THE ROLE OF CHANCE: On average, the model that directly predicted the number of oocytes retrieved deviated from the ground truth by 4.21 oocytes. 
The model that predicted the first clinician's bins deviated by 0.73 bins whereas the model for the second clinician deviated by 0.62 bins. For all models, performance was best within the first and third quartiles of the target variable, with the model underpredicting extreme values of the target variable (no oocytes and large numbers of oocytes retrieved). Nevertheless, the erroneous predictions made for these extreme cases were still within the vicinity of the true value. Overall, all three models agreed on the importance of each feature which was estimated using Shapley Additive Explanation (SHAP) values. The feature with the highest mean absolute SHAP value (and thus the highest importance) was the antral follicle count, followed by basal AMH and FSH. Of the other hormonal features, basal TSH, LH, and testosterone levels were similarly important and baseline LH was the least important. The treatment characteristic with the highest SHAP value was the initial dose of gonadotropins. LIMITATIONS, REASONS FOR CAUTION: The models produced in this study were trained on a cohort from a single centre. They should thus not be used in clinical practice until trained and evaluated on a larger cohort more representative of the general population. WIDER IMPLICATIONS OF FINDINGS: These predictive models for the number of oocytes retrieved from COH may be useful in clinical practice, assisting clinicians in optimizing COH protocols for individual patients. Our work also demonstrates the promise of using the Substra framework for allowing external researchers to provide clinically relevant insights on sensitive fertility data in a fully secure, trustworthy manner and opens a number of exciting avenues for accelerating future research. STUDY FUNDING/COMPETING INTEREST(S): This study was funded by the French Public Bank of Investment as part of the Healthchain Consortium. T.Fe., C.He., J.C., C.J., C.-A.P., and C.Hi. are employed by Apricity. C.Hi. 
has received consulting fees and honoraria from Vitrolife, Merck Serono, Ferring, Cooper Surgical, Dibimed, Apricity, and Fairtility and travel support from Fairtility and Vitrolife, participates on an advisory board for Merck Serono, was the founder and organizer of the AI Fertility conference, has stock in Aria Fertility, TMRW, Fairtility, Apricity, and IVF Professionals, and received free equipment from Planar in exchange for first user feedback. C.J. has received a grant from BPI. J.C. has also received a grant from BPI, is a member of the Merck AI advisory board, and is a board member of Labelia Labs. C.He. has a contract for medical writing of this manuscript by CHU Nantes and has received travel support from Apricity. A.R. has received honoraria from Ferring and Organon. T.Fe. has received a grant from BPI. TRIAL REGISTRATION NUMBER: N/A.
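
The reported average deviations (in oocytes and in bins) read like mean absolute errors, although the abstract does not name the metric. The sketch below assumes that interpretation. The bin edges are hypothetical, since the clinicians' bins are not given in the abstract.

```python
from typing import List, Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def to_bin(count: float, upper_edges: List[float]) -> int:
    """Index of the first bin whose inclusive upper edge is >= count.

    Counts above the last edge fall into a final open-ended bin.
    """
    for i, edge in enumerate(upper_edges):
        if count <= edge:
            return i
    return len(upper_edges)


# Hypothetical data and bin edges (0-3, 4-9, 10-15, 16+ oocytes).
true_counts = [10, 4, 20]
pred_counts = [8, 7, 15]
edges = [3, 9, 15]

oocyte_mae = mean_absolute_error(true_counts, pred_counts)
bin_mae = mean_absolute_error(
    [to_bin(c, edges) for c in true_counts],
    [to_bin(c, edges) for c in pred_counts],
)
```

Measuring the bin error as a distance between bin indices treats a prediction in an adjacent bin as a smaller mistake than one several bins away.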


Subject(s)
Birth Rate, Ovarian Hyperstimulation Syndrome, Male, Female, Humans, Retrospective Studies, Treatment Outcome, Ovulation Induction/methods, Oocytes, In Vitro Fertilization/methods
5.
Virol J ; 19(1): 225, 2022 12 24.
Article in English | MEDLINE | ID: mdl-36566197

ABSTRACT

While hundreds of thousands of human whole genome sequences (WGS) have been collected in the effort to better understand genetic determinants of disease, these whole genome sequences have less frequently been used to study another major determinant of human health: the human virome. Using the unmapped reads from WGS of over 1000 families, we present insights into the human blood DNA virome, focusing particularly on human herpesvirus (HHV) 6A, 6B, and 7. In addition to extensively cataloguing the viruses detected in WGS of human whole blood and lymphoblastoid cell lines, we use the family structure of our dataset to show that household drives transmission of several viruses, and identify the Mendelian inheritance patterns characteristic of inherited chromosomally integrated human herpesvirus 6 (iciHHV-6). Consistent with prior studies, we find that 0.6% of our dataset's population has iciHHV, and we locate candidate integration sequences for these cases. We document genetic diversity within exogenous and integrated HHV species and within integration sites of HHV-6. Finally, in the first observation of its kind, we present evidence that suggests widespread de novo HHV-6B integration and HHV-7 integration and reactivation in lymphoblastoid cell lines. These findings show that the unmapped read space of WGS is a promising source of data for virology research.


Subject(s)
Human Herpesvirus 6, Roseolovirus Infections, Humans, Human Herpesvirus 6/genetics, Virus Integration, Sequence Analysis, Cell Line
6.
Sci Rep ; 12(1): 9863, 2022 06 14.
Article in English | MEDLINE | ID: mdl-35701436

ABSTRACT

The unmapped readspace of whole genome sequencing data tends to be large but is often ignored. We posit that it contains valuable signals of both human infection and contamination. Using unmapped and poorly aligned reads from whole genome sequences (WGS) of over 1000 families and nearly 5000 individuals, we present insights into common viral, bacterial, and computational contamination that plagues whole genome sequencing studies. We present several notable results: (1) In addition to known contaminants such as Epstein-Barr virus and phiX, sequences from whole blood and lymphocyte cell lines contain many other contaminants, likely originating from storage, prep, and sequencing pipelines. (2) The sequencing plate and biological sample source of a sample strongly influence its contamination profile. (3) Y-chromosome fragments not on the human reference genome commonly mismap to bacterial reference genomes. Both experiment-derived and computational contamination are prominent in next-generation sequencing data. Such contamination can compromise results from WGS as well as metagenomics studies, and standard protocols for identifying and removing contamination should be developed to ensure the fidelity of sequencing-based studies.
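
Selecting "unmapped and poorly aligned reads" from an alignment file can be sketched in plain Python over SAM text records. The unmapped bit (0x4) in the FLAG column and the MAPQ column are part of the SAM format. The `min_mapq` threshold of 10 and the function name are assumptions, because the abstract does not say how "poorly aligned" was defined.

```python
from typing import Iterable, Iterator, Tuple

UNMAPPED_FLAG = 0x4  # SAM FLAG bit: segment unmapped


def select_reads(sam_lines: Iterable[str], min_mapq: int = 10) -> Iterator[Tuple[str, str]]:
    """Yield (read name, sequence) for unmapped or low-MAPQ SAM records."""
    for line in sam_lines:
        if line.startswith("@"):  # header line, not an alignment record
            continue
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & UNMAPPED_FLAG or mapq < min_mapq:
            yield fields[0], fields[9]  # QNAME, SEQ


# Hypothetical records: r1 is well aligned, r2 is unmapped, r3 has low MAPQ.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII",
    "r3\t16\tchr2\t5\t3\t4M\t*\t0\t0\tGGCC\tIIII",
]
selected = list(select_reads(example))
```

Real pipelines usually read BAM or CRAM through a dedicated library or command-line tool. The text parsing here only illustrates the selection rule.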


Subject(s)
Bacteriophages, Epstein-Barr Virus Infections, Computational Biology, Bacterial Genome, Human Genome, Viral Genome, Human Herpesvirus 4/genetics, High-Throughput Nucleotide Sequencing/methods, Humans, Whole Genome Sequencing
7.
Pac Symp Biocomput ; 27: 313-324, 2022.
Article in English | MEDLINE | ID: mdl-34890159

ABSTRACT

As the last decade of human genomics research begins to bear the fruit of advancements in precision medicine, it is important to ensure that genomics' improvements in human health are distributed globally and equitably. An important step to ensuring health equity is to improve the human reference genome to capture global diversity by including a wide variety of alternative haplotypes, sequences that are not currently captured on the reference genome. We present a method that localizes 100 basepair (bp) long sequences extracted from short-read sequencing that can ultimately be used to identify what regions of the human genome non-reference sequences belong to. We extract reads that don't align to the reference genome, and compute the population's distribution of 100-mers found within the unmapped reads. We use genetic data from families to identify shared genetic material between siblings and match the distribution of unmapped k-mers to these inheritance patterns to determine the most likely genomic region of a k-mer. We perform this localization with two highly interpretable methods of artificial intelligence: a computationally tractable Hidden Markov Model coupled to a Maximum Likelihood Estimator. Using a set of alternative haplotypes with known locations on the genome, we show that our algorithm is able to localize 96% of k-mers with over 90% accuracy and less than 1Mb median resolution. As the collection of sequenced human genomes grows larger and more diverse, we hope that this method can be used to improve the human reference genome, a critical step in addressing precision medicine's diversity crisis.
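
Computing a population's distribution of k-mers from unmapped reads can be sketched as a per-sample count table. The function names, the choice to skip k-mers containing `N`, and the tiny `k` used in the example are assumptions for illustration. The abstract uses k = 100.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Iterator, List


def kmers(read: str, k: int) -> Iterator[str]:
    """Yield every length-k substring of a read, skipping ambiguous bases."""
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if "N" not in kmer:
            yield kmer


def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Count k-mers over all reads of one sample."""
    counts: Counter = Counter()
    for read in reads:
        counts.update(kmers(read.upper(), k))
    return counts


def population_distribution(samples: Dict[str, List[str]], k: int) -> Dict[str, Dict[str, int]]:
    """Map each k-mer to the samples that contain it and their counts."""
    table: Dict[str, Dict[str, int]] = defaultdict(dict)
    for sample_id, reads in samples.items():
        for kmer, n in count_kmers(reads, k).items():
            table[kmer][sample_id] = n
    return dict(table)


# Hypothetical unmapped reads for two children, with k = 4 for readability.
samples = {"child_1": ["ACGTA"], "child_2": ["CGTAN", "acgt"]}
distribution = population_distribution(samples, k=4)
```

This sketch treats a k-mer and its reverse complement as different sequences. A real analysis would likely canonicalize them and use a dedicated k-mer counter for whole-genome data.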


Subject(s)
Artificial Intelligence, Human Genome, Computational Biology, Genomics, High-Throughput Nucleotide Sequencing, Humans, DNA Sequence Analysis
8.
JMIR Pediatr Parent ; 5(2): e35406, 2022 Apr 14.
Article in English | MEDLINE | ID: mdl-35436234

ABSTRACT

BACKGROUND: Autism spectrum disorder (ASD) is a neurodevelopmental disorder that results in altered behavior, social development, and communication patterns. In recent years, autism prevalence has tripled, with 1 in 44 children now affected. Given that traditional diagnosis is a lengthy, labor-intensive process that requires the work of trained physicians, significant attention has been given to developing systems that automatically detect autism. We work toward this goal by analyzing audio data, as prosody abnormalities are a signal of autism, with affected children displaying speech idiosyncrasies such as echolalia, monotonous intonation, atypical pitch, and irregular linguistic stress patterns. OBJECTIVE: We aimed to test the ability of machine learning approaches to aid in detection of autism in self-recorded speech audio captured from children with ASD and neurotypical (NT) children in their home environments. METHODS: We considered three methods to detect autism in child speech: (1) random forests trained on extracted audio features (including Mel-frequency cepstral coefficients); (2) convolutional neural networks trained on spectrograms; and (3) fine-tuned wav2vec 2.0, a state-of-the-art transformer-based speech recognition model. We trained our classifiers on our novel data set of cellphone-recorded child speech audio curated from the Guess What? mobile game, an app designed to crowdsource videos of children with ASD and NT children in a natural home environment. RESULTS: The random forest classifier achieved 70% accuracy, the fine-tuned wav2vec 2.0 model achieved 77% accuracy, and the convolutional neural network achieved 79% accuracy when classifying children's audio as either ASD or NT. We used 5-fold cross-validation to evaluate model performance. 
CONCLUSIONS: Our models were able to predict autism status when trained on a varied selection of home audio clips with inconsistent recording qualities, which may be more representative of real-world conditions. The results demonstrate that machine learning methods offer promise in detecting autism automatically from speech without specialized equipment.
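
The 5-fold cross-validation mentioned in the results can be sketched with the standard library alone. The function names, the shuffling seed, and the majority-class stand-in model are assumptions for illustration. The abstract does not describe how folds were built (for example, whether clips from the same child were kept in one fold).

```python
import random
from collections import Counter
from typing import Callable, Iterator, List, Sequence, Tuple


def k_fold_indices(n: int, k: int = 5, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; every sample is tested exactly once."""
    if n < k:
        raise ValueError("need at least k samples")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def cross_validated_accuracy(labels: Sequence[str],
                             fit_predict: Callable[[List[int], List[int]], List[str]],
                             k: int = 5, seed: int = 0) -> float:
    """Mean accuracy over k folds; fit_predict trains on train, predicts test."""
    scores = []
    for train, test in k_fold_indices(len(labels), k, seed):
        predictions = fit_predict(train, test)
        correct = sum(p == labels[i] for p, i in zip(predictions, test))
        scores.append(correct / len(test))
    return sum(scores) / k


# Hypothetical labels and a majority-class baseline standing in for a classifier.
labels = ["ASD"] * 6 + ["NT"] * 4


def majority_baseline(train: List[int], test: List[int]) -> List[str]:
    majority = Counter(labels[i] for i in train).most_common(1)[0][0]
    return [majority] * len(test)


baseline_accuracy = cross_validated_accuracy(labels, majority_baseline)
```

A baseline like this gives a floor against which accuracies such as the 70% to 79% reported above can be read, since class balance alone can produce a non-trivial score.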
