Results 1 - 6 of 6
1.
Methods Mol Biol ; 2432: 187-200, 2022.
Article in English | MEDLINE | ID: mdl-35505216

ABSTRACT

Recent studies using epigenetic data have explored whether a person's age can be estimated from their DNA alone. This application stems from the strong correlation observed in humans between the methylation status of certain DNA loci and chronological age. While genome-wide methylation sequencing has been the most prominent approach in epigenetics research, recent studies have shown that targeted sequencing of a limited number of loci can be used successfully to estimate chronological age from DNA samples, even with small datasets. Following this shift, multiple studies have identified the need to investigate further the appropriate statistics behind the predictive models used for DNA methylation-based prediction. This chapter presents an example of basic data manipulation and modeling that can be applied to small DNA methylation datasets (100-400 samples) produced through targeted methylation sequencing for a small number of predictors (10-25 methylation sites). Data manipulation focuses on converting the methylation values obtained for the different predictors into a statistically meaningful dataset, followed by a basic introduction to importing such datasets into R and to randomizing and splitting them into appropriate training and test sets for modeling. Finally, a basic introduction to modeling in R is outlined, starting with feature selection algorithms and continuing with a simple modeling example (linear model) as well as a more complex algorithm (Support Vector Machine).


Subjects
DNA Methylation , Epigenesis, Genetic , DNA/genetics , Humans , Machine Learning , Support Vector Machine
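The workflow outlined in the abstract above (randomise the samples, split them into training and test sets, fit a linear model) can be sketched roughly as follows. The chapter itself works in R; this is a Python sketch using only the standard library. The data are synthetic, and the single predictor, the assumed linear relationship, the 70/30 split and every function name are illustrative assumptions, not taken from the chapter.

```python
import random
from statistics import fmean


def make_synthetic_samples(n, seed=0):
    """Hypothetical data: (methylation fraction at one CpG site, age in years)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        age = rng.uniform(18, 80)
        # Assumed relationship: methylation rises linearly with age, plus noise.
        beta = 0.20 + 0.005 * age + rng.gauss(0, 0.02)
        samples.append((beta, age))
    return samples


def train_test_split(samples, test_fraction=0.3, seed=1):
    """Shuffle a copy of the samples, then return (training set, test set)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(pairs):
    """Ordinary least squares for age = slope * methylation + intercept."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(pairs, slope, intercept):
    """Mean absolute difference between predicted and true age."""
    return fmean([abs(slope * x + intercept - y) for x, y in pairs])


samples = make_synthetic_samples(200)
train, test = train_test_split(samples)
slope, intercept = fit_line(train)
test_mae = mean_absolute_error(test, slope, intercept)
print(f"train n={len(train)}, test n={len(test)}, test MAE={test_mae:.2f} years")
```

The model is fitted on the training set only and scored on the held-out test set, so the reported error is not flattered by samples the model has already seen. A real analysis would use several methylation sites as predictors rather than one.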
2.
Forensic Sci Int Genet ; 57: 102637, 2022 03.
Article in English | MEDLINE | ID: mdl-34852982

ABSTRACT

The estimation of chronological age from biological fluids has been an important quest for forensic scientists worldwide, with recent approaches exploiting the variability of DNA methylation patterns with age in order to develop the next generation of forensic 'DNA intelligence' tools for this application. Drawing on the conclusions of previous work utilising massively parallel sequencing (MPS) for this analysis, this work introduces a DNA methylation-based age estimation method for blood that exhibits the best combination of prediction accuracy and sensitivity reported to date. Statistical evaluation of markers from 51 studies using microarray data from over 4000 individuals, followed by validation using MPS data generated in-house, revealed a final set of 11 markers with the greatest potential for accurate age estimation from minimal DNA material. Utilising an algorithm based on support vector machines, the proposed model achieved a mean absolute error (MAE) of 3.3 years, with this level of accuracy retained down to 5 ng of starting DNA input (~1 ng PCR input). The accuracy of the model was retained (MAE = 3.8 years) in a separate test set of 88 samples of Spanish origin, while predictions for donors of greater forensic interest (< 55 years of age) displayed even higher accuracy (MAE = 2.6 years). Finally, no sex-related bias was observed for this model, and no variation was observed between control and disease-associated populations for schizophrenia, rheumatoid arthritis, frontotemporal dementia and progressive supranuclear palsy in microarray data relating to the 11 markers.


Subjects
Aging , DNA Methylation , Aging/genetics , Child, Preschool , CpG Islands/genetics , DNA/genetics , Forensic Genetics , Humans , Intelligence
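The abstract above reports accuracy for a subgroup (donors under 55) and checks for sex-related bias. One simple way to run that kind of check is to stratify prediction errors by group, as in the sketch below. The records are invented for illustration, and the function name and the use of mean signed error as the bias measure are assumptions, not the study's stated procedure.

```python
from collections import defaultdict
from statistics import fmean


def errors_by_group(records, key):
    """Summarise prediction error per group.

    bias = mean signed error (predicted - actual); mae = mean absolute error.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec["predicted"] - rec["age"])
    return {
        name: {
            "n": len(errs),
            "bias": fmean(errs),
            "mae": fmean([abs(e) for e in errs]),
        }
        for name, errs in groups.items()
    }


# Hypothetical records, not data from the study.
records = [
    {"age": 23, "predicted": 25, "sex": "F"},
    {"age": 31, "predicted": 28, "sex": "M"},
    {"age": 47, "predicted": 49, "sex": "F"},
    {"age": 52, "predicted": 51, "sex": "M"},
    {"age": 60, "predicted": 55, "sex": "F"},
    {"age": 74, "predicted": 80, "sex": "M"},
]

by_sex = errors_by_group(records, lambda r: r["sex"])
by_age = errors_by_group(records, lambda r: "<55" if r["age"] < 55 else ">=55")
print(by_sex)
print(by_age)
```

A bias near zero in every group, together with similar MAE values across groups, is what "no sex-related bias" would look like in such a table. With real data, a formal statistical test would be needed before drawing that conclusion.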
3.
Forensic Sci Int Genet ; 34: e1-e6, 2018 05.
Article in English | MEDLINE | ID: mdl-29506869

ABSTRACT

A total of 3128 Y-STR profiles from three UK populations and one Irish population have been analysed with the PowerPlex Y23 system and are reported here. Instances of haplotype sharing between apparently unrelated individuals were identified and further investigated using the 5 additional markers within the Yfiler Plus kit, resulting in a 76% reduction in the number of shared haplotypes. Furthermore, Yfiler Plus was also employed to verify locus deletions and duplications observed in Y23 genotypes, while samples showing inconsistencies between the two kits were sequenced, revealing underlying Y23 primer binding site mutations in loci DYS392 and DYS576. Finally, the mechanism behind a previously reported population-specific peak shift observed in DYS481 in South Asian samples has been evaluated and further investigated in a novel case of this phenomenon seen in a Black British individual featuring a different flanking region mutation.


Subjects
Chromosomes, Human, Y , Genetics, Population , Microsatellite Repeats , Chromosome Deletion , Chromosome Duplication , DNA Fingerprinting , Genotype , Haplotypes , Humans , Ireland , Racial Groups/genetics , United Kingdom
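The haplotype-sharing analysis described in the abstract above can be illustrated with a small counting sketch: group profiles by their alleles at a set of loci, then see how many shared haplotypes remain once extra markers are added. The three DYS locus names come from the abstract. The allele values, the placeholder loci `EXTRA_1` and `EXTRA_2`, and the function name are hypothetical.

```python
from collections import Counter


def shared_haplotypes(profiles, loci):
    """Return haplotypes (allele tuples over `loci`) carried by more than one profile."""
    counts = Counter(tuple(p[locus] for locus in loci) for p in profiles)
    return {hap: n for hap, n in counts.items() if n > 1}


# Hypothetical profiles; EXTRA_1 and EXTRA_2 stand in for additional markers.
profiles = [
    {"DYS392": 13, "DYS576": 18, "DYS481": 22, "EXTRA_1": 30, "EXTRA_2": 38},
    {"DYS392": 13, "DYS576": 18, "DYS481": 22, "EXTRA_1": 31, "EXTRA_2": 38},
    {"DYS392": 11, "DYS576": 17, "DYS481": 25, "EXTRA_1": 29, "EXTRA_2": 36},
    {"DYS392": 11, "DYS576": 17, "DYS481": 25, "EXTRA_1": 29, "EXTRA_2": 36},
    {"DYS392": 14, "DYS576": 19, "DYS481": 23, "EXTRA_1": 32, "EXTRA_2": 37},
]

base_loci = ["DYS392", "DYS576", "DYS481"]
extended_loci = base_loci + ["EXTRA_1", "EXTRA_2"]

shared_base = shared_haplotypes(profiles, base_loci)
shared_extended = shared_haplotypes(profiles, extended_loci)

# Proportional drop in the number of distinct shared haplotypes.
reduction = 1 - len(shared_extended) / len(shared_base)
print(f"shared: {len(shared_base)} -> {len(shared_extended)} ({reduction:.0%} reduction)")
```

In this toy set, the first pair of profiles is separated by `EXTRA_1` while the second pair stays identical, which mirrors how additional markers resolve some, but not necessarily all, shared haplotypes.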
4.
Forensic Sci Int Genet ; 37: 215-226, 2018 11.
Article in English | MEDLINE | ID: mdl-30243148

ABSTRACT

The field of DNA intelligence focuses on retrieving information from DNA evidence that can help narrow down large groups of suspects or define target groups of interest. With recent breakthroughs in the estimation of geographical ancestry and physical appearance, the estimation of chronological age completes this circle of information. Recent studies have identified methylation sites in the human genome that correlate strongly with age and can be used for the development of age-estimation algorithms. In this study, 110 whole blood samples from individuals aged 11-93 years were analysed using a DNA methylation quantification assay based on bisulphite conversion and massively parallel sequencing (Illumina MiSeq) of 12 CpG sites. Using these data, 17 different statistical modelling approaches were compared based on root mean square error (RMSE), and a Support Vector Machine with polynomial function (SVMp) model was selected for further testing. For the selected model (RMSE = 4.9 years), the mean absolute error (MAE) of the blind test (n = 33) was calculated at 4.1 years, with 52% of the samples predicted with less than 4 years of error and 86% with less than 7 years. Furthermore, the sensitivity of the method was assessed both in terms of methylation quantification accuracy and prediction accuracy, in the first validation of this kind. The described method retained its accuracy down to 10 ng of initial DNA input or ∼2 ng bisulphite PCR input. Finally, 34 saliva samples were analysed and, following basic normalisation, the chronological age of the donors was predicted with less than 4 years of error for 50% of the samples and with less than 7 years of error for 70%.


Subjects
Aging/genetics , DNA Methylation , High-Throughput Nucleotide Sequencing , Adolescent , Adult , Aged , Aged, 80 and over , Blood Chemical Analysis , Child , CpG Islands/genetics , Humans , Male , Middle Aged , Models, Statistical , Multiplex Polymerase Chain Reaction , Neural Networks, Computer , Reproducibility of Results , Saliva/chemistry , Semen/chemistry , Sequence Analysis, DNA , Sulfites , Support Vector Machine , Young Adult
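The abstract above summarises model performance with RMSE, MAE and the share of samples predicted within 4 and 7 years of the true age. A minimal sketch of how these three summaries are computed is given below. The ages and predictions are invented for illustration and the function names are assumptions.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: penalises large errors more heavily than MAE."""
    squared = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error, in the same units as the ages (years)."""
    absolute = [abs(p - a) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


def fraction_within(actual, predicted, years):
    """Share of samples whose absolute error is strictly less than `years`."""
    hits = [abs(p - a) < years for a, p in zip(actual, predicted)]
    return sum(hits) / len(hits)


# Hypothetical chronological ages and model predictions.
actual_ages = [20, 35, 50, 65, 80]
predicted_ages = [23, 33, 58, 64, 75]

model_rmse = rmse(actual_ages, predicted_ages)
model_mae = mae(actual_ages, predicted_ages)
within_4 = fraction_within(actual_ages, predicted_ages, 4)
within_7 = fraction_within(actual_ages, predicted_ages, 7)
print(f"RMSE={model_rmse:.2f}, MAE={model_mae:.2f}, <4y={within_4:.0%}, <7y={within_7:.0%}")
```

RMSE is never smaller than MAE for the same predictions, and the gap between them widens when a few samples have large errors. That is one reason a model can be selected on RMSE and then reported with MAE, as in the abstract.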
5.
Forensic Sci Int Genet ; 37: 241-251, 2018 11.
Article in English | MEDLINE | ID: mdl-30268682

ABSTRACT

Human head hair shape, commonly classified as straight, wavy, curly or frizzy, is an attractive target for Forensic DNA Phenotyping and other applications of human appearance prediction from DNA, such as in paleogenetics. The genetic knowledge underlying head hair shape variation was recently improved by the outcome of a series of genome-wide association and replication studies in a total of 26,964 subjects, highlighting 12 loci, of which 8 were novel, and introducing a prediction model for Europeans based on 14 SNPs. In the present study, we evaluated the capacity of DNA-based head hair shape prediction by investigating an extended set of candidate SNP predictors and by using an independent set of samples for model validation. Prediction model building was carried out in 9674 previously used subjects (6068 from Europe, 2899 from Asia and 707 of admixed European and Asian ancestries) by considering a novel list of 90 candidate SNPs. For model validation, genotype and phenotype data were newly collected in 2415 independent subjects (2138 Europeans and 277 non-Europeans) by applying two targeted massively parallel sequencing platforms, Ion Torrent PGM and MiSeq, or the MassARRAY platform. A binomial model was developed to predict straight vs. non-straight hair based on 32 SNPs from 26 genetic loci that we identified as contributing significantly to the model. This model achieved prediction accuracies, expressed as AUC, of 0.664 in Europeans and 0.789 in non-Europeans; the statistically significant difference was explained mostly by the effect of one EDAR SNP in non-Europeans. Considering sex and age in addition to the SNPs slightly but not significantly increased the prediction accuracies (AUC of 0.680 and 0.800, respectively).
Based on the sample size and candidate DNA markers investigated, this study provides the most robust, validated, and accurate statistical prediction models and SNP predictor marker sets currently available for predicting head hair shape from DNA, providing the next step towards broadening Forensic DNA Phenotyping beyond pigmentation traits.


Subjects
DNA/genetics , Hair , Phenotype , Polymorphism, Single Nucleotide , Adult , Genome-Wide Association Study , Genotyping Techniques/instrumentation , High-Throughput Nucleotide Sequencing , Humans , Logistic Models , Models, Genetic , Sequence Analysis, DNA
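The abstract above expresses prediction accuracy as AUC for a binary outcome (straight vs. non-straight hair). AUC equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition. The labels and scores are invented, and the function name is an assumption.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for the positive class (here, straight hair), 0 otherwise.
    scores: model-predicted probability of the positive class.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    # A positive outscoring a negative counts 1; a tie counts 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.4]

example_auc = auc(labels, scores)
print(f"AUC = {example_auc:.3f}")
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, which gives a scale for reading the values of 0.664 and 0.789 reported in the abstract. The pairwise loop is quadratic in sample size; rank-based implementations are preferable for large cohorts.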
6.
Forensic Sci Int Genet ; 28: 225-236, 2017 05.
Article in English | MEDLINE | ID: mdl-28254385

ABSTRACT

The ability to estimate the age of the donor from biological material recovered at a crime scene can be of substantial value in forensic investigations. Aging can be complex and is associated with various molecular modifications in cells that accumulate over a person's lifetime, including epigenetic patterns. The aim of this study was to use age-specific DNA methylation patterns to generate an accurate model for the prediction of chronological age using data from whole blood. In total, 45 age-associated CpG sites were selected based on their reported age coefficients in a previous extensive study and investigated using publicly available methylation data obtained from 1156 whole blood samples (aged 2-90 years) analysed with Illumina's genome-wide methylation platforms (27K/450K). Applying stepwise regression for variable selection, 23 of these CpG sites were identified that could significantly contribute to age prediction modelling, and multiple regression analysis carried out with these markers provided an accurate prediction of age (R² = 0.92, mean absolute error (MAE) = 4.6 years). However, applying machine learning, and more specifically a generalised regression neural network model, significantly improved the age prediction (R² = 0.96), with an MAE of 3.3 years for the training set and 4.4 years for a blind test set of 231 cases. The machine learning approach used 16 CpG sites, located in 16 different genomic regions, with the top 3 predictors of age belonging to the genes NHLRC1, SCGN and CSNK1D. The proposed model was further tested using independent cohorts of 53 monozygotic twins (MAE = 7.1 years) and a cohort of 1011 disease state individuals (MAE = 7.2 years). Furthermore, we highlighted the age markers' potential applicability in samples other than blood by predicting age with similar accuracy in 265 saliva samples (R² = 0.96), with an MAE of 3.2 years (training set) and 4.0 years (blind test).
In an attempt to create a sensitive and accurate age prediction test, a next generation sequencing (NGS)-based method able to quantify the methylation status of the selected 16 CpG sites was developed using the Illumina MiSeq® platform. The method was validated using DNA standards of known methylation levels, and the age prediction accuracy was initially assessed in a set of 46 whole blood samples. Although the resulting prediction accuracy using the NGS data was lower than that of the original model (MAE = 7.5 years), it is expected that future optimization of our strategy to account for technical variation, as well as an increased sample size, will improve both the prediction accuracy and reproducibility.


Subjects
Aging/genetics , CpG Islands/genetics , DNA Methylation , High-Throughput Nucleotide Sequencing , Neural Networks, Computer , Adult , Aged , DNA/blood , Epigenomics , Forensic Genetics , Humans , Machine Learning , Middle Aged , Saliva/chemistry , Twins, Monozygotic/genetics
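The abstract above names a generalised regression neural network (GRNN) as its machine learning model. In its textbook form, a GRNN predicts a Gaussian-weighted average of the training targets, with weights that decay with distance from the query point. The sketch below shows that basic form only. The two-site methylation values, the ages, the smoothing parameter `sigma` and the function name are illustrative assumptions and do not reproduce the study's 16-site model.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=0.1):
    """Predict a target as a Gaussian-kernel-weighted mean of training targets.

    train_x: list of feature tuples (e.g. methylation fractions per CpG site).
    train_y: list of targets (ages in years).
    sigma:   smoothing parameter; smaller values follow the nearest samples more closely.
    """
    weights = []
    for x in train_x:
        squared_distance = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-squared_distance / (2 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed: the query is too far from all training points.
        raise ValueError("query is too far from the training data for this sigma")
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Hypothetical training data: methylation at two CpG sites, and donor age.
train_x = [(0.30, 0.70), (0.40, 0.60), (0.50, 0.50), (0.60, 0.40)]
train_y = [20, 40, 60, 80]

midpoint_age = grnn_predict(train_x, train_y, (0.45, 0.55))
near_first_age = grnn_predict(train_x, train_y, (0.30, 0.70), sigma=0.01)
print(f"midpoint query: {midpoint_age:.1f} years, near first sample: {near_first_age:.1f} years")
```

The only quantity to tune is `sigma`, typically chosen by cross-validation. There is no iterative training step, because the training samples themselves act as the network's pattern units.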