Search | BVS Research Portal

Ten simple rules for using public biological data for your research.

Oza, Vishal H; Whitlock, Jordan H; Wilk, Elizabeth J; Uno-Antonison, Angelina; Wilk, Brandon; Gajapathy, Manavalan; Howton, Timothy C; Trull, Austyn; Ianov, Lara; Worthey, Elizabeth A; Lasseigne, Brittany N.

PLoS Comput Biol ; 19(1): e1010749, 2023 01.

Article in English | MEDLINE | ID: mdl-36602970

ABSTRACT

With an increasing amount of biological data available publicly, there is a need for a guide on how to successfully download and use this data. The 10 simple rules for using public biological data are: (1) use public data purposefully in your research; (2) evaluate data for your use case; (3) check data reuse requirements and embargoes; (4) be aware of ethics for data reuse; (5) plan for data storage and compute requirements; (6) know what you are downloading; (7) download programmatically and verify integrity; (8) properly cite data; (9) make reprocessed data and models Findable, Accessible, Interoperable, and Reusable (FAIR) and share; and (10) make pipelines and code FAIR and share. These rules are intended as a guide for researchers wanting to make use of available data and to increase data reuse and reproducibility.
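Rule 7 (download programmatically and verify integrity) can be illustrated with a minimal Python sketch. It assumes the data provider publishes a SHA-256 checksum alongside each file; the function names, the URL, and the checksum in the usage note are hypothetical placeholders, not taken from the article.

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path


def download(url: str, dest: Path) -> Path:
    """Stream a remote file to disk without loading it all into memory."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: Path, expected_hex: str) -> bool:
    """Return True when the file's digest matches the published checksum."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A typical call would be `download("https://example.org/dataset.tsv.gz", Path("dataset.tsv.gz"))` followed by `verify_checksum(Path("dataset.tsv.gz"), published_checksum)`, where both the URL and `published_checksum` stand in for values supplied by the actual repository. Some repositories publish MD5 rather than SHA-256 digests, in which case `hashlib.md5` would take the place of `hashlib.sha256`.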

Subjects

Information Storage and Retrieval; Reproducibility of Results

VarSight: prioritizing clinically reported variants with binary classification algorithms.

Holt, James M; Wilk, Brandon; Birch, Camille L; Brown, Donna M; Gajapathy, Manavalan; Moss, Alexander C; Sosonkina, Nadiya; Wilk, Melissa A; Anderson, Julie A; Harris, Jeremy M; Kelly, Jacob M; Shaterferdosian, Fariba; Uno-Antonison, Angelina E; Weborg, Arthur; Worthey, Elizabeth A.

BMC Bioinformatics ; 20(1): 496, 2019 Oct 15.

Article in English | MEDLINE | ID: mdl-31615419

ABSTRACT

BACKGROUND: When applying genomic medicine to a rare disease patient, the primary goal is to identify one or more genomic variants that may explain the patient's phenotypes. Typically, this is done through annotation, filtering, and then prioritization of variants for manual curation. However, prioritization of variants in rare disease patients remains a challenging task due to the high degree of variability in phenotype presentation and molecular source of disease. Thus, methods that can identify and/or prioritize variants to be clinically reported in the presence of such variability are of critical importance.

METHODS: We tested the application of classification algorithms that ingest variant annotations along with phenotype information for predicting whether a variant will ultimately be clinically reported and returned to a patient. To test the classifiers, we performed a retrospective study on variants that were clinically reported to 237 patients in the Undiagnosed Diseases Network.

RESULTS: We treated the classifiers as variant prioritization systems and compared them to four variant prioritization algorithms and two single-measure controls. We showed that the trained classifiers outperformed all other tested methods with the best classifiers ranking 72% of all reported variants and 94% of reported pathogenic variants in the top 20.

CONCLUSIONS: We demonstrated how freely available binary classification algorithms can be used to prioritize variants even in the presence of real-world variability. Furthermore, these classifiers outperformed all other tested methods, suggesting that they may be well suited for working with real rare disease patient datasets.
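The "top 20" figure in the results is a rank-based measure: variants are ordered by classifier score, and the fraction of clinically reported variants landing within the first k positions is counted. The following Python sketch shows one way such a measure could be computed. It is an illustration under stated assumptions, not the authors' evaluation code; the function names and the toy scores are hypothetical.

```python
def ranks_of_reported(scores, reported):
    """Return the 1-based rank of each reported variant.

    scores:   dict mapping variant id -> classifier score
              (assumed: higher means more likely to be reported)
    reported: iterable of variant ids that were clinically reported
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    position = {variant: i + 1 for i, variant in enumerate(ordered)}
    return [position[variant] for variant in reported]


def fraction_in_top_k(ranks, k=20):
    """Fraction of reported variants whose rank is at most k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy example with made-up scores for a single case.
toy_scores = {"var_a": 0.91, "var_b": 0.10, "var_c": 0.55, "var_d": 0.32}
toy_ranks = ranks_of_reported(toy_scores, reported=["var_a", "var_d"])
toy_fraction = fraction_in_top_k(toy_ranks, k=2)
```

In a multi-patient setting, ranks would be computed per case and pooled before taking the fraction, so that each reported variant is ranked only against the other variants from the same patient.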

Subjects

Algorithms; Genetic Diseases, Inborn/diagnosis; Genomics/methods; Mutation; Rare Diseases/diagnosis; Genetic Diseases, Inborn/genetics; Genetic Predisposition to Disease; Genome, Human; Humans; Phenotype; Polymorphism, Genetic; Precision Medicine/methods; Rare Diseases/genetics; Retrospective Studies; Sequence Analysis, DNA/methods; Software
