Results 1 - 3 of 3
1.
Complex Psychiatry ; 8(1-2): 35-46, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36407771

ABSTRACT

Introduction: Genome-wide association studies (GWAS) have played a critical role in identifying many thousands of loci associated with complex phenotypes and diseases. This has led to several translations of novel disease susceptibility genes into drug targets and care. This, however, has not been the case for analyses where sample sizes are small, which suffer from multiple comparisons testing. The present study examined the statistical impact of combining a burden test methodology, PrediXcan, with a multimodel meta-analysis, cross phenotype association (CPASSOC). Methods: The analysis was conducted on 5 addiction traits: family alcoholism, cannabis craving, alcohol, nicotine, and cannabis dependence and 10 brain tissues: anterior cingulate cortex BA24, cerebellar hemisphere, cortex, hippocampus, nucleus accumbens basal ganglia, caudate basal ganglia, cerebellum, frontal cortex BA9, hypothalamus, and putamen basal ganglia. Our sample consisted of 1,640 participants from the University of California, San Francisco (UCSF) Family Alcoholism Study. Genotypes were obtained through low pass whole genome sequencing and the use of Thunder, a linkage disequilibrium variant caller. Results: The post-PrediXcan gene-phenotype association analysis without aggregation yielded 2 significant results, HCG27 and SPPL2B. Aggregating across phenotypes resulted in no significant findings. Aggregating across tissues resulted in 15 significant and 5 suggestive associations: PPIE, RPL36AL, FOXN2, MTERF4, SEPTIN2, CIAO3, RPL36AL, ZNF304, CCDC66, SSPOP, SLC7A9, LY75, MTRF1L, COA5, and RRP7A; RPS23, GNMT, ERV3-1, APIP, and HLA-B, respectively. Discussion: Given the relatively small size of the cohort, this multimodel approach was able to find over a dozen significant associations between predicted gene expression and addiction traits. Of our findings, 8 had prior associations with similar phenotypes through investigation of the GWAS Atlas. With the onset of improved transcriptome data, this approach should increase in efficacy.
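To illustrate why aggregating across tissues can recover signal that single-tissue tests miss, the following is a minimal sketch of a generic fixed-effects combination of correlated z-scores. It is not the CPASSOC statistic as implemented by its authors; the function names `solve` and `combine_z`, the z-scores, and the correlation matrices are all hypothetical and chosen only for illustration.

```python
import math


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def combine_z(z, corr):
    """Test for a shared shift in correlated z-scores (hypothetical helper).

    Under the null z ~ N(0, corr), the statistic
    (1' R^-1 z)^2 / (1' R^-1 1) follows a chi-square with 1 df.
    """
    ones = [1.0] * len(z)
    w = solve(corr, ones)                      # w = R^-1 1
    num = sum(wi * zi for wi, zi in zip(w, z)) ** 2
    den = sum(w)                               # 1' R^-1 1
    stat = num / den
    p = math.erfc(math.sqrt(stat / 2.0))       # chi-square(1) upper tail
    return stat, p


# Made-up example: one gene, modest z-scores in four uncorrelated tissues.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
stat, p = combine_z([2.0, 2.0, 2.0, 2.0], identity)
print(stat, p)
```

In this made-up case each tissue alone has a two-sided p-value near 0.05, while the combined statistic is far more extreme. Positive correlation between tissues shrinks the gain, which is why the correlation matrix enters the statistic.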

2.
Sci Rep ; 12(1): 5440, 2022 Mar 31.
Article in English | MEDLINE | ID: mdl-35361850

ABSTRACT

Regularized regression analysis is a mature analytic approach to identify weighted sums of variables predicting outcomes. We present a novel Coarse Approximation Linear Function (CALF) to frugally select important predictors and build simple but powerful predictive models. CALF is a linear regression strategy applied to normalized data that uses nonzero weights of +1 or -1. Qualitative (linearly invariant) metrics to be optimized can be (for binary response) Welch (Student) t-test p-value or area under curve (AUC) of receiver operating characteristic, or (for real response) Pearson correlation. Predictor weighting is critically important when developing risk prediction models. While counterintuitive, it is a fact that qualitative metrics can favor CALF with ±1 weights over algorithms producing real number weights. Moreover, while regression methods may be expected to change most or all weight values upon even small changes in input data (e.g., discarding a single subject of hundreds), CALF weights generally do not so change. Similarly, some regression methods applied to collinear or nearly collinear variables yield weights whose magnitude or direction (as a vector in p-space) is unpredictable. In contrast, if some predictors are linearly dependent or nearly so, CALF simply chooses at most one (the most informative, if any) and ignores the others, thus avoiding the inclusion of two or more collinear variables in the model.
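The idea of ±1 weights on normalized predictors can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a greedy forward search (the abstract does not specify the search procedure), uses AUC as the only metric, and the names `zscore`, `auc`, `calf_greedy`, and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def zscore(col):
    """Normalize a column to mean 0, SD 1 (assumes the column is not constant)."""
    m, s = mean(col), pstdev(col)
    return [(v - m) / s for v in col]


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calf_greedy(columns, labels, max_terms=3):
    """Greedily add predictors with weight +1 or -1 while AUC strictly improves."""
    z = {name: zscore(col) for name, col in columns.items()}
    weights, score, best = {}, [0.0] * len(labels), 0.5  # empty model: AUC 0.5
    while len(weights) < max_terms:
        candidate = None
        for name, col in z.items():
            if name in weights:
                continue
            for w in (+1, -1):
                trial = [s + w * c for s, c in zip(score, col)]
                a = auc(trial, labels)
                if a > best + 1e-12:          # require a strict improvement
                    best, candidate = a, (name, w, trial)
        if candidate is None:                 # nothing helps: stop
            break
        name, w, score = candidate
        weights[name] = w
    return weights, best


# Toy data: "b" is an exact linear function of "a"; "c" is uninformative.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [6, 5, 4, 3, 2, 1],
    "c": [2, 1, 2, 1, 2, 1],
}
print(calf_greedy(columns, labels))
```

On this toy data the sketch selects only `a`: once `a` is in the model, the collinear `b` cannot improve the metric and is left out, which mirrors the behavior the abstract describes for linearly dependent predictors.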


Subject(s)
Algorithms , Area Under Curve , Humans , Linear Models , ROC Curve
3.
BMC Bioinformatics ; 22(1): 374, 2021 Jul 20.
Article in English | MEDLINE | ID: mdl-34284719

ABSTRACT

BACKGROUND: As exome sequencing (ES) integrates into clinical practice, we should make every effort to utilize all information generated. Copy-number variation can lead to Mendelian disorders, but small copy-number variants (CNVs) often get overlooked or obscured by under-powered data collection. Many groups have developed methodology for detecting CNVs from ES, but existing methods often perform poorly for small CNVs and rely on large numbers of samples not always available to clinical laboratories. Furthermore, methods often rely on Bayesian approaches requiring user-defined priors in the setting of insufficient prior knowledge. This report first demonstrates the benefit of multiplexed exome capture (pooling samples prior to capture), then presents a novel detection algorithm, mcCNV ("multiplexed capture CNV"), built around multiplexed capture. RESULTS: We demonstrate: (1) multiplexed capture reduces inter-sample variance; (2) our mcCNV method, a novel depth-based algorithm for detecting CNVs from multiplexed capture ES data, improves the detection of small CNVs. We contrast our novel approach, agnostic to prior information, with the commonly used ExomeDepth. In a simulation study mcCNV demonstrated a favorable false discovery rate (FDR). When compared to calls made from matched genome sequencing, we find the mcCNV algorithm performs comparably to ExomeDepth. CONCLUSION: Implementing multiplexed capture increases power to detect single-exon CNVs. The novel mcCNV algorithm may provide a more favorable FDR than ExomeDepth. The greatest benefits of our approach derive from (1) not requiring a database of reference samples and (2) not requiring prior information about the prevalence or size of variants.


Subject(s)
DNA Copy Number Variations , Exome , Algorithms , Bayes Theorem , Exome/genetics , High-Throughput Nucleotide Sequencing , Exome Sequencing