ABSTRACT
Minors (subjects under the legal age, set in this study at 18 years) benefit from a series of legal rights created to protect them and guarantee their welfare. However, throughout the world there are many minors who have no way to prove they are underage, leading to a great interest in predicting legal age with the highest possible accuracy. Current methods, mainly involving X-ray analysis, are highly invasive, so new methods to predict legal age are being studied, such as DNA methylation. To further such studies, we created two age prediction models based on five epigenetic markers: cg21572722 (ELOVL2), cg02228185 (ASPA), cg06639320 (FHL2), cg19283806 (CCDC102B) and cg07082267 (no associated gene), which were analysed in blood samples to determine possible limitations of DNA methylation as an effective tool for legal age estimation. A wide age range prediction model was created using a broad set of samples (14-94 years) yielding a mean absolute error (MAE) of ±4.32 years. A second model, the constrained age prediction model, was created using a reduced range of samples (14-25 years) yielding an MAE of ±1.54 years. Both models, in addition to Horvath's Skin & Blood epigenetic clock, were evaluated using a test set comprising 732 pairs of 18-year-old twins (N=426 monozygotic (MZ) and N=306 dizygotic (DZ) pairs), representing a relevant age of study. Through analysis of the two former age prediction models, we found that constraining the age of the samples forming the training set around the desired age of study significantly reduced the prediction error (from MAE: ±4.07 and ±4.27 years for MZ and DZ twins, respectively; to ±1.31 and ±1.3 years). However, despite low prediction errors, DNA methylation models are still prone to classify same-aged individuals in different categories (minors or adults), even when both samples belong to the same twin pair. 
Additional evaluation of Horvath's Skin & Blood model (391 CpGs) yielded age prediction errors similar to those obtained using only five epigenetic markers (MAE: ±1.87 and ±1.99 years for MZ and DZ twins, respectively).
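The two quantities discussed in this abstract, the mean absolute error and the minor/adult classification of twins around the 18-year threshold, can be illustrated with a minimal sketch. The function names and the predicted ages below are hypothetical and are not taken from the study; the sketch only shows how a low MAE can coexist with twin pairs that fall on opposite sides of the threshold.

```python
from statistics import mean

LEGAL_AGE = 18.0  # legal-age threshold stated in the abstract


def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over all samples."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def is_adult(predicted_age, threshold=LEGAL_AGE):
    """Classify a predicted age as adult (True) or minor (False)."""
    return predicted_age >= threshold


def discordant_pairs(twin_predictions, threshold=LEGAL_AGE):
    """Count twin pairs whose two members fall in different categories."""
    return sum(
        1
        for first, second in twin_predictions
        if is_adult(first, threshold) != is_adult(second, threshold)
    )


# Hypothetical predicted ages for three pairs of 18-year-old twins (not study data).
pairs = [(17.2, 18.9), (18.4, 19.1), (16.8, 17.5)]
flat = [age for pair in pairs for age in pair]

mae = mean_absolute_error(flat, [18.0] * len(flat))
n_discordant = discordant_pairs(pairs)
print(f"MAE = {mae:.2f} years, discordant pairs = {n_discordant}")
```

In this toy input the error stays under one year, yet one pair is split between the minor and adult categories, which is the limitation the abstract describes.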
ABSTRACT
DNA methylation has become a biomarker of great interest in the forensic and clinical fields. In criminal investigations, the study of this epigenetic marker has allowed the development of DNA intelligence tools providing information that can be useful for investigators, such as age prediction. Following a similar trend, when the origin of a sample in a criminal scenario is unknown, the inference of an individual's lifestyle such as tobacco use and alcohol consumption could provide relevant information to help in the identification of DNA donors at the crime scene. At the same time, in the clinical domain, prediction of these trends of consumption could allow the identification of people at risk or better identification of the causes of different pathologies. In the present study, DNA methylation data from the UK AIRWAVE study was used to build two binomial logistic models for the inference of smoking and drinking status. A total of 348 individuals (116 non-smokers, 116 former smokers and 116 smokers) plus a total of 237 individuals (79 non-drinkers, 79 moderate drinkers and 79 drinkers) were used for development of tobacco and alcohol consumption prediction models, respectively. The tobacco prediction model was composed of two CpGs (cg05575921 in AHRR and cg01940273) and the alcohol prediction model three CpGs (cg06690548 in SLC7A11, cg0886875 and cg21294714 in MIR4435-2HG), providing correct classifications of 86.49% and 74.26%, respectively. Validation of the models was performed using leave-one-out cross-validation. Additionally, two independent testing sets were also assessed for tobacco and alcohol consumption. Considering that the consumption of these substances could underlie accelerated epigenetic ageing patterns, the effect of these lifestyles on the prediction of age was evaluated. 
To do that, a quantile regression model based on previous studies was generated, and the potential effect of tobacco and alcohol consumption on epigenetic age was assessed. The Wilcoxon test was used to evaluate the residuals generated by the model and no significant differences were observed between the categories analyzed.
Subjects
DNA Methylation , Smoking , Humans , Smoking/adverse effects , Alcohol Drinking/genetics , DNA , Habits
ABSTRACT
Age prediction from DNA has been a topic of interest in recent years due to the promising results obtained when using epigenetic markers. Since DNA methylation gradually changes across the individual's lifetime, prediction models have been developed accordingly for age estimation. The tissue-dependence of this biomarker usually necessitates the development of tissue-specific age prediction models; accordingly, multiple models for age inference have been constructed for the most commonly encountered forensic tissues (blood, oral mucosa, semen). The analysis of skeletal remains has also been attempted and prediction models for bone have now been reported. Recently, the VISAGE Enhanced Tool was developed for the simultaneous DNA methylation analysis of 8 age-correlated loci using targeted high-throughput sequencing. It has been shown that this method is compatible with epigenetic age estimation models for blood, buccal cells, and bone. Since cartilage is also an important biological source when dealing with decomposed cadavers or postmortem samples, an age prediction model for cartilage has been generated in the present study based on methylation data collected using the VISAGE Enhanced Tool. In this way, we have developed a forensic cartilage age prediction model using a training set composed of 109 samples (19-74 age range) based on DNA methylation levels from three CpGs in FHL2, TRIM59 and KLF14, using multivariate quantile regression which provides a mean absolute error (MAE) of ± 4.41 years. An independent testing set composed of 72 samples (19-75 age range) was also analyzed and provided an MAE of ± 4.26 years. In addition, we demonstrate that the 8 VISAGE markers, comprising EDARADD, TRIM59, ELOVL2, MIR29B2CHG, PDE4C, ASPA, FHL2 and KLF14, can be used as tissue prediction markers which provide reliable blood, buccal cells, bone, and cartilage differentiation using a developed multinomial logistic regression model. 
A training set composed of 392 samples (n = 87 blood, n = 86 buccal cells, n = 110 bone and n = 109 cartilage) was used for building the model (correct classifications: 98.72%, sensitivity: 0.988, specificity: 0.996) and validation was performed using a testing set composed of 192 samples (n = 38 blood, n = 36 buccal cells, n = 46 bone and n = 72 cartilage) showing similar predictive success to the training set (correct classifications: 97.4%, sensitivity: 0.968, specificity: 0.991). By developing both a new cartilage age model and a tissue differentiation model, our study significantly expands the use of the VISAGE Enhanced Tool while increasing the amount of DNA methylation-based information obtained from a single sample and a single forensic laboratory analysis. Both models have been placed in the open-access Snipper forensic classification website.
Subjects
Aging , Costal Cartilage , Humans , Child, Preschool , Aging/genetics , Mouth Mucosa , CpG Islands , Genetic Markers , DNA Methylation , Forensic Genetics/methods , Epigenesis, Genetic , Tripartite Motif Proteins/genetics , Intracellular Signaling Peptides and Proteins/genetics
ABSTRACT
Age estimation based on epigenetic markers is a DNA intelligence tool with the potential to provide relevant information for criminal investigations, as well as to improve the inference of age-dependent physical characteristics such as male pattern baldness or hair color. Age prediction models have been developed based on different tissues, including saliva and buccal cells, which show different methylation patterns as they are composed of different cell populations. On many occasions in a criminal investigation, the origin of a sample or the proportion of tissues is not known with certainty, for example the provenance of cigarette butts, so use of combined models can provide lower prediction errors. In the present study, two tissue-specific and seven age-correlated CpG sites were selected from publicly available data from the Illumina HumanMethylation 450 BeadChip and bibliographic searches, to help build a tissue-dependent model and an age-prediction model, respectively. For the development of both models, a total of 184 samples (N = 91 saliva and N = 93 buccal cells) ranging from 21 to 86 years old were used. Validation of the models was performed using both k-fold cross-validation and an additional set of 184 samples (N = 93 saliva and N = 91 buccal cells, 21-86 years old). The tissue prediction model was developed using two CpG sites (HUNK and RUNX1) based on logistic regression that produced a correct classification rate for saliva and buccal swab samples of 88.59 % for the training set, and 83.69 % for the testing set. Despite these high success rates, a combined age prediction model was developed covering both saliva and buccal cells, using seven CpG sites (cg10501210, LHFPL4, ELOVL2, PDE4C, HOXC4, OTUD7A and EDARADD) based on multivariate quantile regression giving a median absolute error (MAE) of ± 3.54 years and a correct classification rate (%CP±PI) of 76.08 % for the training set, and an MAE of ± 3.66 years and a %CP±PI of 71.19 % for the testing set. 
The addition of tissue of origin as a covariate to the model was assessed, but no improvement was detected in age predictions. Finally, considering the limitations usually faced by forensic DNA analyses, the robustness of the model and the minimum recommended amount of input DNA for bisulfite conversion were evaluated, considering up to 10 ng of genomic DNA for reproducible results. The final multivariate quantile regression age predictor based on the models we developed has been placed in the open-access Snipper forensic classification website.
Subjects
Core Binding Factor Alpha 2 Subunit , Forensic Genetics , Humans , Male , Young Adult , Adult , Middle Aged , Aged , Aged, 80 and over , CpG Islands , Core Binding Factor Alpha 2 Subunit/genetics , Forensic Genetics/methods , Saliva , DNA Methylation , Mouth Mucosa , Genetic Markers , Aging/genetics , DNA , Epigenesis, Genetic
ABSTRACT
Forensic age estimation is a DNA intelligence tool that forms an important part of Forensic DNA Phenotyping. Criminal cases with no suspects or with unsuccessful matches in searches on DNA databases; human identification analyses in mass disasters; anthropological studies or legal disputes; all benefit from age estimation to gain investigative leads. Several age prediction models have been developed to date based on DNA methylation. Although different DNA methylation technologies as well as diverse statistical methods have been proposed, most of them are based on blood samples and mainly restricted to adult age ranges. In the current study, we present an extended age prediction model based on 895 evenly distributed Spanish DNA blood samples from 2 to 104 years old. DNA methylation levels were detected using Agena Bioscience EpiTYPER® technology for a total of seven CpG sites located at seven genomic regions: ELOVL2, ASPA, PDE4C, FHL2, CCDC102B, MIR29B2CHG and chr16:85395429 (GRCh38). The accuracy of the age prediction system was tested by comparing three statistical methods: quantile regression (QR), quantile regression neural network (QRNN) and quantile regression support vector machine (QRSVM). The most accurate predictions were obtained when using QRNN or QRSVM (mean absolute prediction error, MAE of ± 3.36 and ± 3.41, respectively). Validation of the models with an independent Spanish testing set (N = 152) provided similar accuracies for both methods (MAE: ± 3.32 and ± 3.45, respectively). The main advantage of using quantile regression statistical tools lies in obtaining age-dependent prediction intervals, fitting the error to the estimated age. An additional analysis of dimensionality reduction shows a direct correlation of increased error and a reduction of correct classifications as the training sample size is reduced. 
Results indicated that a minimum sample size of six samples per year-of-age covered by the training set is recommended to efficiently capture the most inter-individual variability.
Subjects
Aging , Forensic Genetics , Adolescent , Adult , Aged , Aged, 80 and over , Aging/genetics , Child , Child, Preschool , CpG Islands/genetics , DNA , DNA Methylation , Epigenesis, Genetic , Forensic Genetics/methods , Humans , Middle Aged , Young Adult
ABSTRACT
Individual age estimation can be applied to criminal, legal, and anthropological investigations. DNA methylation has been established as the biomarker of choice for age prediction, since it was observed that specific CpG positions in the genome show systematic changes during an individual's lifetime, with progressive increases or decreases in methylation levels. Subsequently, several forensic age prediction models have been reported, providing average age prediction error ranges of ±3-4 years, using a broad spectrum of technologies and underlying statistical analyses. DNA methylation assessment is not categorical but quantitative. Therefore, the detection platform used plays a pivotal role, since quantitative and semi-quantitative technologies could potentially result in differences in detected DNA methylation levels. In the present study, we analyzed, as a shared sample pool, 84 blood-based DNA controls ranging from 18 to 99 years old using four different technologies: EpiTYPER®, pyrosequencing, MiSeq, and SNaPshot™. The DNA methylation levels detected for CpG sites from ELOVL2, FHL2, and MIR29B2 with each system were compared. A restricted three CpG-site age prediction model was rebuilt for each system, as well as for a combination of technologies, based on previous training datasets, and age predictions were calculated accordingly for all the samples detected with the previous technologies. While the DNA methylation patterns and subsequent age predictions from EpiTYPER®, pyrosequencing, and MiSeq systems are largely comparable for the CpG sites studied, SNaPshot™ gives larger differences, reflected in higher predictive errors. However, these differences can be reduced by applying a z-score data transformation.
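A z-score transformation of the kind mentioned at the end of this abstract can be sketched as follows. This is a minimal illustration only: it assumes values are standardised per CpG site within each platform using the sample standard deviation, which the abstract does not specify, and the methylation fractions are hypothetical.

```python
from statistics import mean, stdev


def z_transform(values):
    """Standardise values to zero mean and unit (sample) standard deviation."""
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        raise ValueError("z-scores are undefined when all values are identical")
    return [(v - centre) / spread for v in values]


# Hypothetical methylation fractions for the same three samples on two platforms.
platform_a = [0.20, 0.40, 0.60]
platform_b = [0.35, 0.50, 0.65]  # compressed and shifted relative to platform_a

z_a = z_transform(platform_a)
z_b = z_transform(platform_b)
print(z_a, z_b)
```

Because z-scores remove platform-specific offset and scale, the two hypothetical platforms end up on a common scale, even though their raw methylation values differ.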
ABSTRACT
European flat oyster (Ostrea edulis) production has suffered a severe decline due to bonamiosis. The responsible parasite enters oyster haemocytes, causing an acute inflammatory response frequently leading to death. We used an immune-enriched oligo-microarray to understand the haemocyte response to Bonamia ostreae by comparing expression profiles between naïve (NS) and long-term affected (AS) populations along a time series (1 d, 30 d, 90 d). AS showed a much higher response just after challenge, which might be indicative of selection for resistance. No regulated genes were detected at 30 d in either population, while a notable reactivation was observed at 90 d, suggesting parasite latency during infection. Genes related to extracellular matrix and protease inhibitors, up-regulated in AS, and those related to histones, down-regulated in NS, might play an important role along the infection. Twenty-four candidate genes related to resistance should be further validated for selection programs aimed at controlling bonamiosis.
Subjects
Haplosporida , Hemocytes/metabolism , Ostrea/genetics , Protozoan Infections/genetics , Transcriptome , Animals , Gene Expression Regulation , Hemocytes/immunology , Ostrea/immunology , Ostrea/metabolism , Protozoan Infections/metabolism
ABSTRACT
Three approaches applicable to the analysis of forensic ancestry-informative marker data (STRUCTURE, principal component analysis, and the Snipper Bayesian classification system) are reviewed. Detailed step-by-step guidance is provided for adjusting parameter settings in STRUCTURE with particular regard to their effect when differentiating populations. Several enhancements to the Snipper online forensic classification portal are described, highlighting the added functionality they bring to particular aspects of ancestry-informative SNP analysis in a forensic context.
Subjects
Forensic Genetics/methods , Genetics, Population , Humans , Internet , Polymorphism, Single Nucleotide , Software
ABSTRACT
Individual age estimation has the potential to provide key information that could enhance and extend DNA intelligence tools. Following predictive tests for externally visible characteristics developed in recent years, prediction of age could guide police investigations and improve the assessment of age-related phenotype expression patterns such as hair colour changes and early onset of male pattern baldness. DNA methylation at CpG positions has emerged as the most promising DNA test to ascertain the individual age of the donor of a biological contact trace. Although different methodologies are available to detect DNA methylation, EpiTYPER technology (Agena Bioscience, formerly Sequenom) provides useful characteristics that can be applied as a discovery tool in localized regions of the genome. In our study, a total of twenty-two candidate genomic regions, selected from the assessment of publicly available data from the Illumina HumanMethylation 450 BeadChip, had a total of 177 CpG sites with informative methylation patterns that were subsequently investigated in detail. From the methylation analyses made, a novel age prediction model based on a multivariate quantile regression analysis was built using the seven highest age-correlated loci of ELOVL2, ASPA, PDE4C, FHL2, CCDC102B, C1orf132 and chr16:85395429. The detected methylation levels in these loci provide a median absolute age prediction error of ±3.07 years and a percentage of prediction error relative to the age of 6.3%. We report the predictive performance of the developed model using cross validation of a carefully age-graded training set of 725 European individuals and a test set of 52 monozygotic twin pairs. The multivariate quantile regression age predictor, using the CpG sites selected in this study, has been placed in the open-access Snipper forensic classification website.
Subjects
Aging/genetics , CpG Islands/genetics , DNA Methylation , Genetic Markers , Software , Adolescent , Adult , Aged , Aged, 80 and over , Female , Genetic Loci , Humans , Male , Mass Spectrometry , Middle Aged , Multivariate Analysis , Polymerase Chain Reaction , Twins, Monozygotic/genetics , Young Adult
ABSTRACT
The turbot is a flatfish with a ZW/ZZ sex determination system but with a still unknown sex determining gene(s), and with a marked sexual growth dimorphism in favor of females. To better understand sexual development in turbot we sampled young turbot encompassing the whole process of gonadal differentiation and conducted a comprehensive transcriptomic study on its sex differentiation using a validated custom oligomicroarray. Also, the expression profiles of 18 canonical reproduction-related genes were studied along gonad development. The expression levels of gonadal aromatase cyp19a1a alone at three months of age allowed the accurate and early identification of sex before the first signs of histological differentiation. A total of 56 differentially expressed genes (DEG) that had not previously been related to sex differentiation in fish were identified within the first three months of age, of which 44 were associated with ovarian differentiation (e.g., cd98, gpd1 and cry2), and 12 with testicular differentiation (e.g., ace, capn8 and nxph1). To identify putative sex determining genes, ~4,000 DEG in juvenile gonads were mapped and their positions compared with that of previously identified sex- and growth-related quantitative trait loci (QTL). Although no genes mapped to the previously identified sex-related QTLs, two genes (foxl2 and 17βhsd) of the canonical reproduction-related genes mapped to growth-QTLs in linkage group (LG) 15 and LG6, respectively, suggesting that these genes are related to the growth dimorphism in this species.
Subjects
Fish Proteins/genetics , Flatfishes/genetics , Gene Expression Profiling/methods , Gonads/growth & development , Oligonucleotide Array Sequence Analysis/methods , Sex Differentiation , Animals , Chromosome Mapping , Female , Gene Expression Regulation, Developmental , Gonads/metabolism , Humans , Male , Quantitative Trait Loci , Sex Characteristics
ABSTRACT
In forensic analysis, predictive tests for externally visible characteristics (or EVCs), including inference of iris color, represent a potentially useful tool to guide criminal investigations. Two recent studies, both focused on forensic testing, have analyzed single nucleotide polymorphism (SNP) genotypes underlying common eye color variation (Mengel-From et al., Forensic Sci. Int. Genet. 4:323 and Walsh et al., Forensic Sci. Int. Genet. 5:170). Each study arrived at different recommendations for eye color predictive tests aiming to type the most closely associated SNPs, although both confirmed rs12913832 in HERC2 as the key predictor, widely recognized as the most strongly associated marker with blue and brown iris colors. Differences between these two studies in identification of other eye color predictors may partly arise from varying approaches to assigning phenotypes, notably those not unequivocally blue or dark brown and therefore occupying an intermediate iris color continuum. We have developed two single base extension assays typing 37 SNPs in pigmentation-associated genes to study SNP-genotype based prediction of eye, skin, and hair color variation. These assays were used to test the performance of different sets of eye color predictors in 416 subjects from six populations of north and south Europe. The presence of a complex and continuous range of intermediate phenotypes distinct from blue and brown eye colors was confirmed by establishing eye color populations compared to genetic clusters defined using Structure software. Our study explored the effect that an expanded SNP combination beyond six markers has on the ability to predict eye color in a forensic test without extending the SNP assay excessively, thus maintaining a balance between the test's predictive value and an ability to reliably type challenging DNA with a multiplex of manageable size. 
Our evaluation used AUC analysis (area under the receiver operating characteristic curves) and naïve Bayesian likelihood-based classification approaches. To provide flexibility in SNP-based eye color predictive tests in forensic applications we modified an online Bayesian classifier, originally developed for genetic ancestry analysis, to provide a straightforward system to assign eye color likelihoods from a SNP profile combining additional informative markers from the predictors analyzed by our study plus those of Walsh and Mengel-From. Two advantages of the online classifier are the ability to submit incomplete SNP profiles, a common occurrence when typing challenging DNA, and the ability to handle physically linked SNPs showing independent effect, by allowing the user to input frequencies from SNP pairs or larger combinations. This system was used to include the submission of frequency data for the SNP pair rs12913832 and rs1129038, indicated by our study to be the two SNPs most closely associated with eye color.
Subjects
Eye Color/genetics , Forensic Genetics , Base Sequence , DNA Primers , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide , Selection, Genetic
ABSTRACT
The CEPH human genome diversity cell line panel (CEPH-HGDP) of 51 globally distributed populations was used to analyze patterns of variability in 20 core human identification STRs. The markers typed comprised the 15 STRs of Identifiler, one of the most widely used forensic STR multiplexes, plus five recently introduced European Standard Set (ESS) STRs: D1S1656, D2S441, D10S1248, D12S391 and D22S1045. From the genotypes obtained for the ESS STRs we identified rare, intermediate or off-ladder alleles that had not been previously reported for these loci. Examples of novel ESS STR alleles found were characterized by sequence analysis. This revealed extensive repeat structure variation in three ESS STRs, with D12S391 showing particularly high variability for tandem runs of AGAT and AGAC repeat units. The global geographic distribution of the CEPH panel samples gave an opportunity to study in detail the extent of substructure shown by the 20 STRs amongst populations and between their parent population groups. An assessment was made of the forensic informativeness of the new ESS STRs compared to the loci they will replace: CSF1PO, D5S818, D7S820, D13S317 and TPOX, with results showing a clear enhancement of discrimination power using multiplexes that genotype the new ESS loci. We also measured the ability of Identifiler and ESS STRs to infer the ancestry of the CEPH-HGDP samples and demonstrate that forensic STRs in large multiplexes have the potential to differentiate the major population groups but only with sufficient reliability when used with other ancestry-informative markers such as single nucleotide polymorphisms. Finally we checked for possible association by linkage between the two ESS multiplex STRs closely positioned on chromosome-12: vWA and D12S391 by examining paired genotypes from the complete CEPH data set.
Subjects
Genetic Variation , Genome, Human , Microsatellite Repeats/genetics , Base Sequence , DNA Primers , Europe , Forensic Genetics , Gene Frequency , Genetic Markers , Humans
ABSTRACT
Tests that infer the ancestral origin of a DNA sample have considerable potential in the development of forensic tools that can help to guide crime investigation. We have developed a single-tube 34-plex SNP assay for the assignment of ancestral origin by choosing ancestry-informative markers (AIMs) exhibiting highly contrasting allele frequency distributions between the three major population-groups. To predict ancestral origin from the profiles obtained, a classification algorithm was developed based on maximum likelihood. Sampling of two populations each from African, European and East Asian groups provided training sets for the algorithm and this was tested using the CEPH Human Genome Diversity Panel. We detected negligible theoretical and practical error for assignments to one of the three groups analyzed with consistently high classification probabilities, even when using reduced subsets of SNPs. This study shows that by choosing SNPs exhibiting marked allele frequency differences between population-groups a practical forensic test for assigning the most likely ancestry can be achieved from a single multiplexed assay.
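A maximum-likelihood assignment of the general kind described in this abstract can be sketched as below. This is not the authors' algorithm: it assumes biallelic SNPs, Hardy-Weinberg proportions within each group, and independence between SNPs, and the group names, allele frequencies and profile are hypothetical. Genotypes are coded as the count (0, 1 or 2) of a chosen reference allele, and missing genotypes are skipped.

```python
import math


def log_likelihood(genotypes, allele_freqs, eps=1e-3):
    """Log-likelihood of a SNP profile given one group's reference-allele frequencies.

    Assumes Hardy-Weinberg proportions and independent SNPs (illustrative only).
    """
    total = 0.0
    for count, freq in zip(genotypes, allele_freqs):
        if count is None:  # missing genotype: contributes nothing
            continue
        p = min(max(freq, eps), 1.0 - eps)  # avoid log(0) for fixed alleles
        total += (
            math.log(math.comb(2, count))
            + count * math.log(p)
            + (2 - count) * math.log(1.0 - p)
        )
    return total


def classify(genotypes, group_freqs):
    """Return the group with the highest log-likelihood, plus all scores."""
    scores = {
        group: log_likelihood(genotypes, freqs)
        for group, freqs in group_freqs.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical reference-allele frequencies for three SNPs in three groups.
group_freqs = {
    "group_A": [0.9, 0.1, 0.8],
    "group_B": [0.1, 0.9, 0.2],
    "group_C": [0.5, 0.5, 0.5],
}
profile = [2, 0, None]  # third SNP failed to type

best_group, scores = classify(profile, group_freqs)
print(best_group, scores)
```

Markers with strongly contrasting allele frequencies between groups, as selected in the study, make the log-likelihoods diverge quickly, which is why even reduced SNP subsets can give a clear assignment.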