FANCY: fast estimation of privacy risk in functional genomics data.
Gürsoy, Gamze; Brannon, Charlotte M; Navarro, Fabio C P; Gerstein, Mark.
Affiliation
  • Gürsoy G; Computational Biology and Bioinformatics, New Haven, CT 06520, USA.
  • Brannon CM; Molecular Biophysics and Biochemistry, New Haven, CT 06520, USA.
  • Navarro FCP; Computational Biology and Bioinformatics, New Haven, CT 06520, USA.
  • Gerstein M; Molecular Biophysics and Biochemistry, New Haven, CT 06520, USA.
Bioinformatics; 36(21): 5145-5150, 2021-01-29.
Article in En | MEDLINE | ID: mdl-32726397
ABSTRACT
MOTIVATION:

Functional genomics data are becoming clinically actionable, raising privacy concerns. However, quantifying privacy leakage via genotyping is difficult due to the heterogeneous nature of sequencing techniques. Thus, we present FANCY, a tool that rapidly estimates the number of leaking variants from raw RNA-Seq, ATAC-Seq and ChIP-Seq reads, without explicit genotyping. FANCY employs supervised regression using overall sequencing statistics as features and provides an estimate of the overall privacy risk before data release.

RESULTS:

FANCY can predict the cumulative number of leaking SNVs with an average R² of 0.95 across all independent test sets. Because accurate prediction is important when the number of leaking variants is low, we also develop a special version of the model that makes more accurate predictions in that regime.

AVAILABILITY AND IMPLEMENTATION:

A Python and MATLAB implementation of FANCY, as well as custom scripts to generate the features, can be found at https://github.com/gersteinlab/FANCY. We also provide Jupyter notebooks so that users can optimize the parameters in the regression model based on their own data. An easy-to-use web server that takes inputs and displays results can be found at fancy.gersteinlab.org.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

Full text: 1 Database: MEDLINE Main subject: Software / Privacy Language: En Publication year: 2021 Document type: Article