1.
Article in English | MEDLINE | ID: mdl-39360788

ABSTRACT

BACKGROUND: Perceived age (PA) has been associated with mortality, genetic variants linked to ageing and several age-related morbidities. However, human estimation of PA in large datasets is laborious and costly, limiting its practical applicability. OBJECTIVES: To determine if estimating PA using deep learning-based algorithms results in the same associations with morbidities and genetic variants as human-estimated perceived age. METHODS: Self-supervised learning (SSL) and deep feature transfer (DFT) deep learning (DL) approaches were trained and tested on human-estimated PAs and their corresponding frontal face images of middle-aged to elderly Dutch participants (n = 2679) from a population-based study in the Netherlands. We compared the DL-estimated PAs with morbidities previously associated with human-estimated PA as well as genetic variants in the gene MC1R; we additionally tested the PA associations with MC1R in a new validation cohort (n = 1158). RESULTS: The DL approaches predicted PA in this population with a mean absolute error of 2.84 years (DFT) and 2.39 years (SSL). In the training-test dataset, we found the same significant (p < 0.05) associations for DL PA with osteoporosis, age-related hearing loss (ARHL), cognition, chronic obstructive pulmonary disease (COPD), cataracts and MC1R as with human PA. We also found a similar but less significant association for SSL and DFT PAs (0.69 and 0.71 years per allele, p = 0.008 and 0.011, respectively) with MC1R variants in the validation dataset as that found with human, SSL and DFT PAs in the training-test dataset (0.79, 0.78 and 0.71 years per allele respectively; all p < 0.0001). CONCLUSIONS: Deep learning methods can automatically estimate PA from facial images with enough accuracy to replicate known links between human-estimated perceived age and several age-related morbidities. Furthermore, DL-predicted perceived age was associated with MC1R gene variants in a validation cohort. Hence, such DL PA techniques may be used instead of human estimations in perceived age studies, thereby reducing time and costs.

2.
J Law Biosci ; 11(2): lsae017, 2024.
Article in English | MEDLINE | ID: mdl-39239310

ABSTRACT

Although national criminal offender DNA databases (NCODDs) including autosomal short tandem repeats (STRs) have been a successful tool to identify criminals for decades in many countries, there are many criminal cases they cannot solve. In cases with mixed male-female samples, particularly sexual assault, expanding NCODDs with Y-chromosomal STR (Y-STR) profiles allows database matching in the absence of autosomal STR profiles. Although Y-STR matches are not individual-specific, this can be largely overcome with rapidly mutating Y-STRs (RM Y-STR) allowing separation of paternally related men. Expanding NCODDs with Y-STR profiles is also beneficial for law enforcement in cases without known suspects via familial searching. Expanding NCODDs with Y-STR profiles may raise concerns about genetic privacy and fundamental human rights. A legal analysis of the European Convention on Human Rights revealed that such an expansion, when used primarily for reidentifying convicted sex offenders, would be in line with the case law of the European Court of Human Rights, while a generalized approach primarily for familial searching and involving all types of offenders may not be. This paper aims to stimulate a debate among various stakeholders regarding the benefits and risks of expanding NCODDs with Y-STR profiles, which has already been implemented in practice in some countries.

5.
Biom J ; 66(4): e2300090, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38813859

ABSTRACT

Linear regression (LR) is widely used in data analysis for continuous outcomes in biomedicine and epidemiology. Despite its popularity, LR is incompatible with missing data, which frequently occur in health sciences. For parameter estimation, this shortcoming is usually resolved by complete-case analysis or imputation. Both work-arounds, however, are inadequate for prediction, since they either fail to predict on incomplete records or ignore missingness-induced reduction in prediction accuracy and rely on (unrealistic) assumptions about the missing mechanism. Here, we derive the adaptive predictor-set linear model (aps-lm), capable of making predictions for incomplete data without the need for imputation. It is derived by using a predictor-selection operation, the Moore-Penrose pseudoinverse, and the reduced QR decomposition. aps-lm is an LR generalization that inherently handles missing values. It is applied to a reference data set, where complete predictors and outcome are available, and yields a set of privacy-preserving parameters. In a second stage, these are shared for making predictions of the outcome on external data sets with missing entries for predictors without imputation. Moreover, aps-lm computes prediction errors that account for the pattern of missing values even under extreme missingness. We benchmark aps-lm in a simulation study. aps-lm showed greater prediction accuracy and reduced bias compared to popular imputation strategies under a wide range of scenarios including variation of sample size, goodness of fit, missing value type, and covariance structure. Finally, as a proof-of-principle, we apply aps-lm in the context of epigenetic aging clocks, linear models that predict a person's biological age from epigenetic data with promising clinical applications.
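
The two-stage idea in this abstract (summarise complete reference data once, then predict on incomplete external records without imputation) can be illustrated with a minimal Python sketch. This is not the authors' aps-lm derivation: it assumes that the shared parameters are means and covariances, and it refits the least-squares submodel on whichever predictors are observed, using the Moore-Penrose pseudoinverse. The function names `fit_reference` and `predict_incomplete` and the simulated data are hypothetical.

```python
import numpy as np

def fit_reference(X, y):
    """Summarise complete reference data by means and (co)variances only."""
    mu_x, mu_y = X.mean(axis=0), y.mean()
    Xc, yc = X - mu_x, y - mu_y
    return {"mu_x": mu_x, "mu_y": mu_y,
            "S_xx": Xc.T @ Xc / len(y),   # predictor covariance matrix
            "s_xy": Xc.T @ yc / len(y)}   # predictor-outcome covariances

def predict_incomplete(params, x):
    """Predict one record; NaN marks a missing predictor (no imputation)."""
    obs = ~np.isnan(x)
    if not obs.any():
        return params["mu_y"]             # nothing observed: fall back to the mean
    # Select the observed predictors and solve the least-squares submodel.
    S = params["S_xx"][np.ix_(obs, obs)]
    beta = np.linalg.pinv(S) @ params["s_xy"][obs]
    return params["mu_y"] + (x[obs] - params["mu_x"][obs]) @ beta

# Simulated reference data (assumption: three predictors, known true model).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=500)

params = fit_reference(X, y)
full = predict_incomplete(params, np.array([0.3, -0.2, 1.0]))
partial = predict_incomplete(params, np.array([0.3, np.nan, 1.0]))
```

In this sketch, only summary statistics leave the reference site, and the prediction for a record with a missing entry equals what an ordinary regression on the remaining predictors would give. The prediction-error calculation described in the abstract is not reproduced here.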


Subject(s)
Biometry , Linear Models , Biometry/methods , Humans
6.
EClinicalMedicine ; 71: 102550, 2024 May.
Article in English | MEDLINE | ID: mdl-38545426

ABSTRACT

Background: Efficient identification of individuals at high risk of skin cancer is crucial for implementing personalized screening strategies and subsequent care. While Artificial Intelligence holds promising potential for predictive analysis using image data, its application for skin cancer risk prediction utilizing facial images remains unexplored. We present a neural network-based explainable artificial intelligence (XAI) approach for skin cancer risk prediction based on 2D facial images and compare its efficacy to 18 established skin cancer risk factors using data from the Rotterdam Study. Methods: The study employed data from the Rotterdam population-based study in which skin cancer risk factors, 2D facial images and the occurrence of skin cancer were collected from 2010 to 2018. We conducted a deep-learning survival analysis based on 2D facial images using our developed XAI approach. We subsequently compared these results with survival analysis based on skin cancer risk factors using Cox proportional hazards regression. Findings: Among the 2810 participants (mean age = 68.5 ± 9.3 years, average follow-up = 5.0 years), 228 participants were diagnosed with skin cancer after photo acquisition. Our XAI approach achieved superior predictive accuracy based on 2D facial images (c-index = 0.72, 95% CI: 0.70-0.74), outperforming that of the known risk factors (c-index = 0.59, 95% CI 0.57-0.61). Interpretation: This proof-of-concept study underscores the high potential of harnessing facial images and a tailored XAI approach as an easily accessible alternative to known risk factors for identifying individuals at high risk of skin cancer. Funding: The Rotterdam Study is funded through unrestricted research grants from Erasmus Medical Center and Erasmus University, Rotterdam, Netherlands Organization for the Health Research and Development (ZonMw), the Research Institute for Diseases in the Elderly (RIDE), the Ministry of Education, Culture and Science, the Ministry for Health, Welfare and Sports, the European Commission (DG XII), and the Municipality of Rotterdam. G.V. Roshchupkin is supported by the ZonMw Veni grant (Veni, 549 1936320).
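
The c-index reported in this abstract measures how well a risk score orders participants by time to event. As a rough illustration, the sketch below computes Harrell's concordance index for right-censored data in plain Python. The abstract does not state which c-index variant was used, so Harrell's definition is an assumption, and the toy follow-up times, event flags and risk scores are invented for the example.

```python
def concordance_index(times, events, risk):
    """Harrell's C: share of comparable pairs that the risk score orders correctly."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable if i had the event strictly before j's time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0      # earlier event has the higher risk
                elif risk[i] == risk[j]:
                    concordant += 0.5      # ties in risk count as half
    return concordant / comparable

# Toy data (hypothetical): years of follow-up, event flag (0 = censored), risk score.
times = [2, 4, 5, 7]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.1]
c = concordance_index(times, events, risk)  # 4 of 5 comparable pairs, i.e. 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the reported 0.72 and 0.59.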

7.
Forensic Sci Int Genet ; 71: 103030, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38513339

ABSTRACT

The genetic characterization and identification of individuals who contributed to biological mixtures are complex and mostly unresolved tasks. These tasks are relevant in various fields, particularly in forensic investigations, which frequently encounter crime scene stains generated by more than one person. Currently, forensic mixture deconvolution is mostly performed subsequent to forensic DNA profiling at the level of the mixed DNA profiles, which comes with several limitations. Some previous studies attempted to separate single cells prior to forensic DNA profiling. However, these approaches are biased in the selection of the cells and, due to their targeted DNA analysis on low template DNA, provide incomplete and unreliable forensic DNA profiles. We recently demonstrated the feasibility of performing mixture deconvolution prior to forensic DNA profiling through the utilization of non-targeted single-cell transcriptome sequencing (scRNA-seq). In addition to individual-specific mixture deconvolution, this approach also allowed accurate characterisation of biological sex, biogeographic ancestry and individual identification of the separated mixture contributors. However, RNA has the forensic disadvantage of being prone to degradation, and sequencing RNA - focussing on coding regions - limits the number of single nucleotide polymorphisms (SNPs) utilized for genetic mixture deconvolution, characterization, and identification. These limitations can be overcome by performing single-cell sequencing on the level of DNA instead of RNA. Here, for the first time, we applied non-targeted single-cell DNA sequencing (scDNA-seq), using the scATAC-seq (Assay for Transposase-Accessible Chromatin with sequencing) technique, to address the challenges of mixture deconvolution in the forensic context. We demonstrated that scATAC-seq, together with our recently developed De-goulash data analysis pipeline, is capable of deconvoluting complex blood mixtures of five individuals from both sexes with varying biogeographic ancestries. We further showed that our approach achieved correct genetic characterization of the biological sex and the biogeographic ancestry of each of the separated mixture contributors and established their identity. Furthermore, by analysing in-silico generated scATAC-seq data mixtures, we demonstrated successful individual-specific mixture deconvolution of i) highly complex mixtures of 11 individuals, ii) balanced mixtures containing as few as 20 cells (10 per individual), and iii) imbalanced mixtures with a ratio as low as 1:80. Overall, our proof-of-principle study demonstrates the feasibility of scDNA-seq in general, and scATAC-seq in particular, for mixture deconvolution, genetic characterization and individual identification of the separated mixture contributors. Furthermore, it shows that compared to scRNA-seq, scDNA-seq detects more SNPs from fewer cells, providing higher sensitivity, which is valuable in forensic genetics.


Subject(s)
DNA Fingerprinting , Polymorphism, Single Nucleotide , Single-Cell Analysis , Humans , Sequence Analysis, DNA , Female , Male , Forensic Genetics/methods , DNA/genetics
8.
Genes (Basel) ; 15(2)2024 02 10.
Article in English | MEDLINE | ID: mdl-38397216

ABSTRACT

Y-chromosomal short tandem repeats (Y-STRs) are widely used in forensic, genealogical, and population genetics. With the recent increase in the number of rapidly mutating (RM) Y-STRs, an unprecedented level of male differentiation can be achieved, widening and improving the applications of Y-STRs in various fields, including forensics. The growing complexity of Y-STR data increases the need for automated data analyses, but dedicated software tools are scarce. To address this, we present the Male Pedigree Toolbox (MPT), a software tool for the automated analysis of Y-STR data in the context of patrilineal genealogical relationships. The MPT can estimate mutation rates and male relative differentiation rates from input Y-STR pedigree data. It can aid in determining ancestral haplotypes within a pedigree and visualize the genetic variation within pedigrees in all branches of family trees. Additionally, it can provide probabilistic classifications using machine learning, helping to establish or prove the structure of the pedigree and the level of relatedness between males, even for closely related individuals with highly similar haplotypes. The tool is flexible and easy to use and can be adjusted to any set of Y-STR markers by modifying the intuitive input file formats. We introduce the MPT software tool v1.0 and make it publicly available with the goal of encouraging and supporting forensic, genealogical, and other geneticists in utilizing the full potential of Y-STRs for both research purposes and practical applications, including criminal casework.


Subject(s)
Genetics, Population , Mutation Rate , Male , Humans , Pedigree , Haplotypes/genetics , Chromosomes, Human, Y/genetics
10.
Bioinform Adv ; 3(1): vbad176, 2023.
Article in English | MEDLINE | ID: mdl-38075477

ABSTRACT

Motivation: We introduce SMapper, a novel web and software tool for visualizing spatial prevalence data of all types including those suffering from incomplete geographic coverage and insufficient sample sizes. We demonstrate the benefits of our tool in overcoming interpretational issues with existing tools caused by such data limitations. We exemplify the use of SMapper by applications to human genotype and phenotype data relevant in an epidemiological, anthropological and forensic context. Availability and implementation: A web implementation is available at https://rhodos.ccg.uni-koeln.de/smapper/. A stand-alone version, released under the GNU General Public License version 3 as published by the Free Software Foundation, is available from https://rhodos.ccg.uni-koeln.de/smapper/software-download.php as a Singularity container (https://docs.sylabs.io/guides/latest/user-guide/index.html) and a native Linux Python installation.

11.
PLoS Genet ; 19(7): e1010786, 2023 07.
Article in English | MEDLINE | ID: mdl-37459304

ABSTRACT

Human ear morphology, a complex anatomical structure represented by a multidimensional set of correlated and heritable phenotypes, has a poorly understood genetic architecture. In this study, we quantitatively assessed 136 ear morphology traits using deep learning analysis of digital face images in 14,921 individuals from five different cohorts in Europe, Asia, and Latin America. Through GWAS meta-analysis and C-GWASs, a recently introduced method to effectively combine GWASs of many traits, we identified 16 genetic loci involved in various ear phenotypes, eight of which have not been previously associated with human ear features. Our findings suggest that ear morphology shares genetic determinants with other surface ectoderm-derived traits such as facial variation, mono eyebrow, and male pattern baldness. Our results enhance the genetic understanding of human ear morphology and shed light on the shared genetic contributors of different surface ectoderm-derived phenotypes. Additionally, gene editing experiments in mice have demonstrated that knocking out the newly ear-associated gene (Intu) and a previously ear-associated gene (Tbx15) causes deviating mouse ear morphology.


Subject(s)
Genetic Loci , Genome-Wide Association Study , Humans , Male , Animals , Mice , Genome-Wide Association Study/methods , Phenotype , Asia , Polymorphism, Single Nucleotide/genetics
13.
Forensic Sci Int Genet ; 65: 102878, 2023 07.
Article in English | MEDLINE | ID: mdl-37116245

ABSTRACT

Tobacco smoking is a frequent habit sustained by > 1.3 billion people in 2020 and the leading preventable factor for health risk and premature mortality worldwide. In the forensic context, predicting smoking habits from biological samples may allow broadening DNA phenotyping. In this study, we aimed to implement previously published smoking habit classification models based on blood DNA methylation at 13 CpGs. First, we developed a matching lab tool based on bisulfite conversion and multiplex PCR followed by amplification-free library preparation and targeted paired-end massively parallel sequencing (MPS). Analysis of six technical duplicates revealed high reproducibility of methylation measurements (Pearson correlation of 0.983). Artificially methylated standards uncovered marker-specific amplification bias, which we corrected via bi-exponential models. We then applied our MPS tool to 232 blood samples from Europeans of a wide age range, of which 90 were current, 71 former and 71 never smokers. On average, we obtained 189,000 reads/sample and 15,000 reads/CpG, without marker drop-out. Methylation distributions per smoking category roughly corresponded to previous microarray analysis, showcasing large inter-individual variation but with technology-driven bias. Methylation at 11 out of 13 smoking-CpGs correlated with daily cigarettes in current smokers, while only one was weakly correlated with time since cessation in former smokers. Interestingly, eight smoking-CpGs correlated with age, and one displayed weak but significant sex-associated methylation differences. Using bias-uncorrected MPS data, smoking habits were relatively accurately predicted using both the two-category (current/non-current) and three-category (never/former/current) models, but bias correction resulted in worse prediction performance for both models. Finally, to account for technology-driven variation, we built new, joint models with inter-technology corrections, which resulted in improved prediction results for both models, with or without PCR bias correction (e.g. MPS cross-validation F1-score > 0.8; 2-categories). Overall, our novel assay takes us one step closer towards the forensic application of viable smoking habit prediction from blood traces. However, future research is needed towards forensically validating the assay, especially in terms of sensitivity. We also need to shed further light on the employed biomarkers, particularly on the mechanisms, tissue specificity and putative confounders of smoking epigenetic signatures.


Subject(s)
DNA Methylation , Smoking , Humans , Reproducibility of Results , Smoking/genetics , Polymerase Chain Reaction , High-Throughput Nucleotide Sequencing , CpG Islands/genetics
15.
Proc Natl Acad Sci U S A ; 120(18): e2212685120, 2023 05 02.
Article in English | MEDLINE | ID: mdl-37094145

ABSTRACT

Circadian rhythms influence physiology, metabolism, and molecular processes in the human body. Estimation of individual body time (circadian phase) is therefore highly relevant for individual optimization of behavior (sleep, meals, sports), diagnostic sampling, medical treatment, and for treatment of circadian rhythm disorders. Here, we provide a partial least squares regression (PLSR) machine learning approach that uses plasma-derived metabolomics data in one or more samples to estimate dim light melatonin onset (DLMO) as a proxy for circadian phase of the human body. For this purpose, our protocol aimed to stay close to real-life conditions. We found that a metabolomics approach optimized for either women or men under entrained conditions performed as well as or better than existing approaches using more labor-intensive RNA sequencing-based methods. Although estimation of circadian body time using blood-targeted metabolomics requires further validation in shift work and other real-world conditions, it currently may offer a robust, feasible technique with relatively high accuracy to aid personalized optimization of behavior and clinical treatment after appropriate validation in patient populations.


Subject(s)
Human Body , Melatonin , Male , Humans , Female , Light , Circadian Rhythm/physiology , Sleep/physiology , Melatonin/metabolism , Metabolomics
16.
Forensic Sci Int Genet ; 65: 102870, 2023 07.
Article in English | MEDLINE | ID: mdl-37084623

ABSTRACT

Forensic DNA Phenotyping (FDP) comprises the prediction of a person's externally visible characteristics regarding appearance, biogeographic ancestry and age from DNA of crime scene samples, to provide investigative leads to help find unknown perpetrators that cannot be identified with forensic STR-profiling. In recent years, FDP has advanced considerably in all of its three components, which we summarize in this review article. Appearance prediction from DNA has broadened beyond eye, hair and skin color to additionally comprise other traits such as eyebrow color, freckles, hair structure, hair loss in men, and tall stature. Biogeographic ancestry inference from DNA has progressed from continental ancestry to sub-continental ancestry detection and the resolving of co-ancestry patterns in genetically admixed individuals. Age estimation from DNA has widened beyond blood to more somatic tissues such as saliva and bones as well as new markers and tools for semen. Technological progress has allowed forensically suitable DNA technology with largely increased multiplex capacity for the simultaneous analysis of hundreds of DNA predictors with targeted massively parallel sequencing (MPS). Forensically validated MPS-based FDP tools for predicting from crime scene DNA i) several appearance traits, ii) multi-regional ancestry, iii) several appearance traits together with multi-regional ancestry, and iv) age from different tissue types, are already available. Despite recent advances that will likely increase the impact of FDP in criminal casework in the near future, moving reliable appearance, ancestry and age prediction from crime scene DNA to the level of detail and accuracy police investigators may desire, requires further intensified scientific research together with technical developments and forensic validations as well as the necessary funding.


Subject(s)
DNA , Forensic Genetics , Humans , Phenotype , DNA/genetics , Forensic Medicine , Skin Pigmentation , Polymorphism, Single Nucleotide , Eye Color
18.
Commun Biol ; 6(1): 201, 2023 02 20.
Article in English | MEDLINE | ID: mdl-36805025

ABSTRACT

Identifying individuals from biological mixtures to which they contributed is highly relevant in crime scene investigation and various biomedical research fields, but despite previous attempts, remains nearly impossible. Here we investigated the potential of using single-cell transcriptome sequencing (scRNA-seq), coupled with a dedicated bioinformatics pipeline (De-goulash), to solve this long-standing problem. We developed a novel approach and tested it with scRNA-seq data that we de-novo generated from multi-person blood mixtures, and also in-silico mixtures we assembled from public single individual scRNA-seq datasets, involving different numbers, ratios, and bio-geographic ancestries of contributors. For all 2- to 9-person balanced and imbalanced blood mixtures with ratios up to 1:60, we achieved a clear single-cell separation according to the contributing individuals. For all separated mixture contributors, sex and bio-geographic ancestry (maternal, paternal, and bi-parental) were correctly determined. All separated contributors were correctly individually identified with court-acceptable statistical certainty using de-novo generated whole exome sequencing reference data. In this proof-of-concept study, we demonstrate the feasibility of single-cell approaches to deconvolute biological mixtures and subsequently genetically characterise and individually identify the separated mixture contributors. With further optimisation and implementation, this approach may eventually allow moving to challenging biological mixtures, including those found at crime scenes.


Subject(s)
Parents , Transcriptome , Humans , Exome Sequencing , Cell Separation
19.
Br J Dermatol ; 188(3): 390-395, 2023 02 22.
Article in English | MEDLINE | ID: mdl-36763776

ABSTRACT

BACKGROUND: Looking older for one's chronological age is associated with a higher mortality rate. Yet it remains unclear how perceived facial age relates to morbidity and the degree to which facial ageing reflects systemic ageing of the human body. OBJECTIVES: To investigate the association between ΔPA and age-related morbidities of different organ systems, where ΔPA represents the difference between perceived age (PA) and chronological age. METHODS: We performed a cross-sectional analysis on data from the Rotterdam Study, a population-based cohort study in the Netherlands. High-resolution facial photographs of 2679 men and women aged 51.5-87.8 years of European descent were used to assess PA. PA was estimated and scored in 5-year categories using these photographs by a panel of men and women who were blinded for chronological age and medical history. A linear mixed model was used to generate the mean PAs. The difference between the mean PA and chronological age was calculated (ΔPA), where a higher (positive) ΔPA means that the person looks younger for their age and a lower (negative) ΔPA that the person looks older. ΔPA was tested as a continuous variable for association with ageing-related morbidities including cardiovascular, pulmonary, ophthalmological, neurocognitive, renal, skeletal and auditory morbidities in separate regression analyses, adjusted for age and sex (model 1) and additionally for body mass index, smoking and sun exposure (model 2). RESULTS: We observed 5-year higher ΔPA (i.e. looking younger by 5 years for one's age) to be associated with less osteoporosis [odds ratio (OR) 0.76, 95% confidence interval (CI) 0.62-0.93], less chronic obstructive pulmonary disease (OR 0.85, 95% CI 0.77-0.95), less age-related hearing loss (model 2; B = -0.76, 95% CI -1.35 to -0.17) and fewer cataracts (OR 0.84, 95% CI 0.73-0.97), and with better global cognitive functioning (g-factor; model 2; B = 0.07, 95% CI 0.04-0.10). CONCLUSIONS: PA is associated with multiple morbidities and better cognitive function, suggesting that systemic ageing and cognitive ageing are, to an extent, externally visible in the human face.
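
The sign convention for ΔPA and the "per 5-year" odds ratios in this abstract are easy to misread, so a small worked example may help. The sketch below assumes that ΔPA is computed as chronological age minus perceived age, which is the convention implied by "positive ΔPA means looking younger"; the abstract does not spell out the subtraction order. It also relies on the standard property that logistic regression is linear on the log-odds scale. The function names are hypothetical.

```python
import math

def delta_pa(chronological_age, perceived_age):
    # Assumed sign convention: positive means the person looks younger than their age.
    return chronological_age - perceived_age

def rescale_odds_ratio(or_per_5_years, years):
    # An OR reported per 5-year step rescales to another step as OR ** (years / 5),
    # because the model is linear in the log-odds.
    return math.exp(math.log(or_per_5_years) * years / 5.0)

print(delta_pa(70.0, 65.0))                     # 5.0: looks five years younger
print(round(rescale_odds_ratio(0.76, 10), 3))   # 0.578: osteoporosis OR for a 10-year higher ΔPA
```

The second line is only an extrapolation of the reported estimate under the model's linearity assumption, not a result from the study.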


Subject(s)
Aging , Skin Aging , Aged , Middle Aged , Male , Humans , Female , Cohort Studies , Cross-Sectional Studies , Facies , Morbidity
20.
Eur J Hum Genet ; 31(3): 321-328, 2023 03.
Article in English | MEDLINE | ID: mdl-36336714

ABSTRACT

Genetic prediction of male pattern baldness (MPB) is important in science and society. Previous genetic MPB prediction models were limited by sparse marker coverage, small sample size, and/or data dependency in the different analytical steps. Here, we present novel models for genetic prediction of MPB based on a large set of markers and large independent subsample sets drawn among 187,435 European subjects. We selected 117 SNP predictors within 85 distinct loci from a list of 270 previously MPB-associated SNPs in 55,573 males of the UK Biobank Study (UKBB). Based on these 117 SNPs with and without age as additional predictor, we trained, by use of different methods, prediction models in a non-overlapping subset of 104,694 UKBB males and tested them in a non-overlapping subset of 26,177 UKBB males. Estimates of prediction accuracy were similar between methods with AUC ranges of 0.725-0.728 for severe, 0.631-0.635 for moderate, 0.598-0.602 for slight, and 0.708-0.711 for no hair loss with age, and slightly lower without, while prediction of any versus no hair loss gave 0.690-0.711 with age and slightly lower without. External validation in an early-onset enriched MPB dataset from the Bonn Study (N = 991) showed improved prediction accuracy without considering age such as AUC of 0.830 for no vs. any hair loss. Because of the large number of markers and the large independent datasets used for the different analytical steps, the newly presented genetic prediction models are the most reliable ones currently available for MPB or any other human appearance trait.
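
The AUC values quoted in this abstract can be read as the probability that a randomly chosen case receives a higher predicted score than a randomly chosen non-case. The following plain-Python sketch computes AUC in exactly that pairwise way. It is a generic illustration with invented scores and labels, not the evaluation code used in the study.

```python
def auc(scores, labels):
    """Pairwise (Mann-Whitney) AUC: P(score of a case > score of a non-case)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    # Each case/non-case pair scores 1 if ordered correctly, 0.5 if tied, else 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): predicted probability of hair loss and observed status.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
value = auc(scores, labels)   # 5 of 6 pairs ordered correctly
```

On this scale 0.5 means no discrimination, so the reported range of roughly 0.60 to 0.73 within the UK Biobank test set indicates moderate discrimination that varies by hair-loss category.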


Subject(s)
Alopecia , Polymorphism, Single Nucleotide , Humans , Male , Phenotype , Alopecia/genetics