Results 1 - 6 of 6
1.
Methods Mol Biol ; 2432: 187-200, 2022.
Article in English | MEDLINE | ID: mdl-35505216

ABSTRACT

Recent research studies using epigenetic data have been exploring whether it is possible to estimate how old someone is using only their DNA. This application stems from the strong correlation that has been observed in humans between the methylation status of certain DNA loci and chronological age. While genome-wide methylation sequencing has been the most prominent approach in epigenetics research, recent studies have shown that targeted sequencing of a limited number of loci can be successfully used for the estimation of chronological age from DNA samples, even when using small datasets. Following this shift, the need to investigate further into the appropriate statistics behind the predictive models used for DNA methylation-based prediction has been identified in multiple studies. This chapter will look into an example of basic data manipulation and modeling that can be applied to small DNA methylation datasets (100-400 samples) produced through targeted methylation sequencing for a small number of predictors (10-25 methylation sites). Data manipulation will focus on converting the obtained methylation values for the different predictors to a statistically meaningful dataset, followed by a basic introduction into importing such datasets in R, as well as randomizing and splitting into appropriate training and test sets for modeling. Finally, a basic introduction to R modeling will be outlined, starting with feature selection algorithms and continuing with a simple modeling example (linear model) as well as a more complex algorithm (Support Vector Machine).
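The randomise, split and model steps described in this abstract can be sketched roughly as follows. The chapter itself works in R; Python is used here purely for illustration. The data are synthetic stand-ins (one hypothetical CpG site with an assumed linear age trend), and `shuffle_split` and `fit_line` are illustrative helper names, not functions from the chapter.

```python
import random
import statistics

def shuffle_split(rows, test_fraction=0.3, seed=42):
    """Shuffle rows reproducibly and split them into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic data (NOT real measurements): the methylation fraction at one
# hypothetical site is assumed to rise linearly with age, plus Gaussian noise.
rng = random.Random(0)
samples = []
for _ in range(200):
    age = rng.uniform(18, 80)
    methylation = 0.20 + 0.005 * age + rng.gauss(0, 0.02)
    samples.append((methylation, age))

train, test = shuffle_split(samples, test_fraction=0.3, seed=42)

# Fit age ~ methylation on the training set only.
slope, intercept = fit_line([m for m, _ in train], [a for _, a in train])

# Evaluate on the held-out test set using the mean absolute error.
test_mae = statistics.fmean(abs((slope * m + intercept) - a) for m, a in test)
print(f"train={len(train)} test={len(test)} test MAE={test_mae:.1f} years")
```

A real dataset of the kind described would carry 10-25 predictors per sample, so a multi-predictor model (or the Support Vector Machine mentioned in the abstract) would replace the single-site fit. The split logic stays the same.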


Subject(s)
DNA Methylation , Genetic Epigenesis , DNA/genetics , Humans , Machine Learning , Support Vector Machine
2.
Forensic Sci Int Genet ; 57: 102637, 2022 03.
Article in English | MEDLINE | ID: mdl-34852982

ABSTRACT

The estimation of chronological age from biological fluids has been an important quest for forensic scientists worldwide, with recent approaches exploiting the variability of DNA methylation patterns with age in order to develop the next generation of forensic 'DNA intelligence' tools for this application. Drawing from the conclusions of previous work utilising massively parallel sequencing (MPS) for this analysis, this work introduces a DNA methylation-based age estimation method for blood that exhibits the best combination of prediction accuracy and sensitivity reported to date. Statistical evaluation of markers from 51 studies using microarray data from over 4000 individuals, followed by validation using in-house generated MPS data, revealed a final set of 11 markers with the greatest potential for accurate age estimation from minimal DNA material. Utilising an algorithm based on support vector machines, the proposed model achieved a mean absolute error (MAE) of 3.3 years, with this level of accuracy retained down to 5 ng of starting DNA input (~1 ng PCR input). The accuracy of the model was retained (MAE = 3.8 years) in a separate test set of 88 samples of Spanish origin, while predictions for donors of greater forensic interest (< 55 years of age) displayed even higher accuracy (MAE = 2.6 years). Finally, no sex-related bias was observed for this model, and no signs of variation were observed between control and disease-associated populations for schizophrenia, rheumatoid arthritis, frontotemporal dementia and progressive supranuclear palsy in microarray data relating to the 11 markers.


Subject(s)
Aging , DNA Methylation , Aging/genetics , Preschool Child , CpG Islands/genetics , DNA/genetics , Forensic Genetics , Humans , Intelligence
3.
Forensic Sci Int Genet ; 34: e1-e6, 2018 05.
Article in English | MEDLINE | ID: mdl-29506869

ABSTRACT

A total of 3128 Y-STR profiles from three UK and one Irish population have been analysed with the PowerPlex Y23 system and are reported here. Instances of haplotype sharing between apparently unrelated individuals were identified and further investigated with the use of the 5 additional markers within the Yfiler Plus kit, resulting in a reduction by 76% in the number of shared haplotypes. Furthermore, Yfiler Plus was also employed to verify locus deletions and duplications observed in Y23 genotypes while inconsistencies between the two kits were sequenced, revealing underlying Y23 primer binding site mutations in loci DYS392 and DYS576. Finally, the mechanism behind a previously reported population specific peak shift observed in DYS481 in South Asian samples has been evaluated and further investigated in a novel case of this phenomenon seen in a Black British individual featuring a different flanking region mutation.
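Counting shared haplotypes before and after adding markers, as this abstract describes, can be sketched as below. The allele values and individual labels are invented toy data, not taken from the study, and `count_shared_haplotypes` is a hypothetical helper name.

```python
from collections import Counter

def count_shared_haplotypes(profiles):
    """Number of distinct haplotypes carried by more than one individual."""
    counts = Counter(profiles.values())
    return sum(1 for n in counts.values() if n > 1)

# Toy data: three "core" markers per individual (made-up allele calls).
core = {
    "A": (14, 13, 29),
    "B": (14, 13, 29),
    "C": (15, 12, 30),
    "D": (15, 12, 30),
    "E": (16, 13, 28),
}
# Two additional markers typed on the same individuals (also made up).
extra = {
    "A": (17, 21),
    "B": (17, 21),
    "C": (18, 20),
    "D": (19, 20),
    "E": (17, 22),
}

# Extended haplotype = core markers followed by the additional markers.
extended = {person: core[person] + extra[person] for person in core}

shared_before = count_shared_haplotypes(core)
shared_after = count_shared_haplotypes(extended)
reduction_pct = 100 * (shared_before - shared_after) / shared_before
print(shared_before, shared_after, f"{reduction_pct:.0f}% reduction")
```

In this toy set, individuals C and D match on the core markers but are separated by one additional marker, which is the effect the abstract reports at population scale.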


Subject(s)
Human Y Chromosomes , Population Genetics , Microsatellite Repeats , Chromosome Deletion , Chromosome Duplication , DNA Fingerprinting , Genotype , Haplotypes , Humans , Ireland , Racial Groups/genetics , United Kingdom
4.
Forensic Sci Int Genet ; 37: 215-226, 2018 11.
Article in English | MEDLINE | ID: mdl-30243148

ABSTRACT

The field of DNA intelligence focuses on retrieving information from DNA evidence that can help narrow down large groups of suspects or define target groups of interest. With recent breakthroughs in the estimation of geographical ancestry and physical appearance, the estimation of chronological age comes to complete this circle of information. Recent studies have identified methylation sites in the human genome that correlate strongly with age and can be used for the development of age-estimation algorithms. In this study, 110 whole blood samples from individuals aged 11-93 years were analysed using a DNA methylation quantification assay based on bisulphite conversion and massively parallel sequencing (Illumina MiSeq) of 12 CpG sites. Using these data, 17 different statistical modelling approaches were compared based on root mean square error (RMSE) and a Support Vector Machine with polynomial function (SVMp) model was selected for further testing. For the selected model (RMSE = 4.9 years) the mean absolute error (MAE) of the blind test (n = 33) was calculated at 4.1 years, with 52% of the samples predicted with less than 4 years of error and 86% with less than 7 years. Furthermore, the sensitivity of the method was assessed both in terms of methylation quantification accuracy and prediction accuracy in the first validation of this kind. The described method retained its accuracy down to 10 ng of initial DNA input or ∼2 ng bisulphite PCR input. Finally, 34 saliva samples were analysed and following basic normalisation, the chronological age of the donors was predicted with less than 4 years of error for 50% of the samples and with less than 7 years of error for 70%.
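The error measures quoted in this abstract (RMSE, MAE, and the share of samples predicted within a given number of years) can be computed along these lines. The ages below are invented toy values, not results from the study, and the function names are illustrative.

```python
import math

def mae(predicted, actual):
    """Mean absolute error, in the same units as the inputs (here: years)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean square error; penalises large misses more than MAE does."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )

def fraction_within(predicted, actual, max_error):
    """Share of samples whose absolute error is strictly below max_error."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) < max_error)
    return hits / len(actual)

# Toy predicted and true ages (made up for illustration only).
predicted_ages = [25.0, 40.0, 61.0, 33.0]
true_ages = [22.0, 45.0, 60.0, 41.0]

print("MAE :", mae(predicted_ages, true_ages))
print("RMSE:", round(rmse(predicted_ages, true_ages), 2))
print("< 4 years:", fraction_within(predicted_ages, true_ages, 4))
print("< 7 years:", fraction_within(predicted_ages, true_ages, 7))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, which is consistent with the abstract reporting an RMSE above the MAE.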


Subject(s)
Aging/genetics , DNA Methylation , High-Throughput Nucleotide Sequencing , Adolescent , Adult , Aged , Aged 80 and over , Blood Chemical Analysis , Child , CpG Islands/genetics , Humans , Male , Middle Aged , Statistical Models , Multiplex Polymerase Chain Reaction , Computer Neural Networks , Reproducibility of Results , Saliva/chemistry , Semen/chemistry , DNA Sequence Analysis , Sulfites , Support Vector Machine , Young Adult
5.
Forensic Sci Int Genet ; 37: 241-251, 2018 11.
Article in English | MEDLINE | ID: mdl-30268682

ABSTRACT

Human head hair shape, commonly classified as straight, wavy, curly or frizzy, is an attractive target for Forensic DNA Phenotyping and other applications of human appearance prediction from DNA such as in paleogenetics. The genetic knowledge underlying head hair shape variation was recently improved by the outcome of a series of genome-wide association and replication studies in a total of 26,964 subjects, highlighting 12 loci of which 8 were novel and introducing a prediction model for Europeans based on 14 SNPs. In the present study, we evaluated the capacity of DNA-based head hair shape prediction by investigating an extended set of candidate SNP predictors and by using an independent set of samples for model validation. Prediction model building was carried out in 9674 subjects (6068 from Europe, 2899 from Asia and 707 of admixed European and Asian ancestries), used previously, by considering a novel list of 90 candidate SNPs. For model validation, genotype and phenotype data were newly collected in 2415 independent subjects (2138 Europeans and 277 non-Europeans) by applying two targeted massively parallel sequencing platforms, Ion Torrent PGM and MiSeq, or the MassARRAY platform. A binomial model was developed to predict straight vs. non-straight hair based on 32 SNPs from 26 genetic loci we identified as significantly contributing to the model. This model achieved prediction accuracies, expressed as AUC, of 0.664 in Europeans and 0.789 in non-Europeans; the statistically significant difference was explained mostly by the effect of one EDAR SNP in non-Europeans. Considering sex and age, in addition to the SNPs, slightly and insignificantly increased the prediction accuracies (AUC of 0.680 and 0.800, respectively). 
Based on the sample size and candidate DNA markers investigated, this study provides the most robust, validated, and accurate statistical prediction models and SNP predictor marker sets currently available for predicting head hair shape from DNA, providing the next step towards broadening Forensic DNA Phenotyping beyond pigmentation traits.
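The AUC used in this abstract to express prediction accuracy can be computed from predicted probabilities with the pairwise (Mann-Whitney) formulation sketched below. The labels and scores are invented, with 1 standing for one hair-shape class and 0 for the other; this is a generic illustration, not the authors' pipeline.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample receives
    a higher score than a randomly chosen negative one (ties count as 0.5).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy labels and predicted probabilities (made up for illustration only).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
print(round(auc(toy_labels, toy_scores), 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the 0.664 and 0.789 reported above.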


Subject(s)
DNA/genetics , Hair , Phenotype , Single Nucleotide Polymorphism , Adult , Genome-Wide Association Study , Genotyping Techniques/instrumentation , High-Throughput Nucleotide Sequencing , Humans , Logistic Models , Genetic Models , DNA Sequence Analysis
6.
Forensic Sci Int Genet ; 28: 225-236, 2017 05.
Article in English | MEDLINE | ID: mdl-28254385

ABSTRACT

The ability to estimate the age of the donor from recovered biological material at a crime scene can be of substantial value in forensic investigations. Aging can be complex and is associated with various molecular modifications in cells that accumulate over a person's lifetime including epigenetic patterns. The aim of this study was to use age-specific DNA methylation patterns to generate an accurate model for the prediction of chronological age using data from whole blood. In total, 45 age-associated CpG sites were selected based on their reported age coefficients in a previous extensive study and investigated using publicly available methylation data obtained from 1156 whole blood samples (aged 2-90 years) analysed with Illumina's genome-wide methylation platforms (27K/450K). Applying stepwise regression for variable selection, 23 of these CpG sites were identified that could significantly contribute to age prediction modelling and multiple regression analysis carried out with these markers provided an accurate prediction of age (R²=0.92, mean absolute error (MAE)=4.6 years). However, applying machine learning, and more specifically a generalised regression neural network model, the age prediction significantly improved (R²=0.96) with an MAE=3.3 years for the training set and 4.4 years for a blind test set of 231 cases. The machine learning approach used 16 CpG sites, located in 16 different genomic regions, with the top 3 predictors of age belonging to the genes NHLRC1, SCGN and CSNK1D. The proposed model was further tested using independent cohorts of 53 monozygotic twins (MAE=7.1 years) and a cohort of 1011 disease state individuals (MAE=7.2 years). Furthermore, we highlighted the age markers' potential applicability in samples other than blood by predicting age with similar accuracy in 265 saliva samples (R²=0.96) with an MAE=3.2 years (training set) and 4.0 years (blind test).
In an attempt to create a sensitive and accurate age prediction test, a next-generation sequencing (NGS)-based method able to quantify the methylation status of the selected 16 CpG sites was developed using the Illumina MiSeq® platform. The method was validated using DNA standards of known methylation levels and the age prediction accuracy has been initially assessed in a set of 46 whole blood samples. Although the resulting prediction accuracy using the NGS data was lower compared to the original model (MAE=7.5 years), it is expected that future optimization of our strategy to account for technical variation as well as increasing the sample size will improve both the prediction accuracy and reproducibility.
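The R² values quoted in this abstract summarise how much of the variance in age a model explains. A minimal sketch of the coefficient of determination follows, assuming the usual 1 - SS_res/SS_tot definition. The abstract does not say which definition was used (some studies report the squared Pearson correlation instead), and the ages below are invented.

```python
def r_squared(predicted, actual):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_residual = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    return 1.0 - ss_residual / ss_total

# Toy true and predicted ages (made up for illustration only).
toy_true = [10.0, 20.0, 30.0, 40.0]
toy_pred = [12.0, 18.0, 33.0, 39.0]
print(round(r_squared(toy_pred, toy_true), 3))
```

Under this definition a model that always predicts the mean age scores 0, and a perfect model scores 1.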


Subject(s)
Aging/genetics , CpG Islands/genetics , DNA Methylation , High-Throughput Nucleotide Sequencing , Computer Neural Networks , Adult , Aged , DNA/blood , Epigenomics , Forensic Genetics , Humans , Machine Learning , Middle Aged , Saliva/chemistry , Monozygotic Twins/genetics