1.
Bioinformatics ; 39(1), 2023 01 01.
Article in English | MEDLINE | ID: mdl-36688705

ABSTRACT

MOTIVATION: Advances in sequencing technologies have led to a surge in genomic data, although the functions of many gene products coded by these genes remain unknown. While in-depth, targeted experiments that determine the functions of these gene products are crucial and routinely performed, they fail to keep up with the inflow of novel genomic data. In an attempt to address this gap, high-throughput experiments are being conducted in which a large number of genes are investigated in a single study. The annotations generated as a result of these experiments are generally biased towards a small subset of less informative Gene Ontology (GO) terms. Identifying and removing biases from protein function annotation databases is important since biases impact our understanding of protein function by providing a poor picture of the annotation landscape. Additionally, as machine learning methods for predicting protein function are becoming increasingly prevalent, it is essential that they are trained on unbiased datasets. Therefore, it is not only crucial to be aware of biases, but also to judiciously remove them from annotation datasets. RESULTS: We introduce GOThresher, a Python tool that identifies and removes biases in function annotations from protein function annotation databases. AVAILABILITY AND IMPLEMENTATION: GOThresher is written in Python and released via PyPI https://pypi.org/project/gothresher/ and on the Bioconda Anaconda channel https://anaconda.org/bioconda/gothresher. The source code is hosted on GitHub https://github.com/FriedbergLab/GOThresher and distributed under the GPL 3.0 license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology , Genomics , Computational Biology/methods , Molecular Sequence Annotation , Software , Proteins/genetics , Proteins/metabolism , Protein Databases
2.
Nucleic Acids Res ; 49(1): 67-78, 2021 01 11.
Article in English | MEDLINE | ID: mdl-33305328

ABSTRACT

Gene-editing experiments commonly elicit the error-prone non-homologous end joining for DNA double-strand break (DSB) repair. Microhomology-mediated end joining (MMEJ) can generate more predictable outcomes for functional genomic and somatic therapeutic applications. We compared three DSB repair prediction algorithms - MENTHU, inDelphi, and Lindel - in identifying MMEJ-repaired, homogeneous genotypes (PreMAs) in an independent dataset of 5,885 distinct Cas9-mediated mouse embryonic stem cell DSB repair events. MENTHU correctly identified 46% of all PreMAs available, a ∼2- and ∼60-fold sensitivity increase compared to inDelphi and Lindel, respectively. In contrast, only Lindel correctly predicted predominant single-base insertions. We report the new algorithm MENdel, a combination of MENTHU and Lindel, that achieves the most predictive coverage of homogeneous out-of-frame mutations in this large dataset. We then estimated the frequency of Cas9-targetable homogeneous frameshift-inducing DSBs in vertebrate coding regions for gene discovery using MENdel. 47 out of 54 genes (87%) contained at least one early frameshift-inducing DSB and 49 out of 54 (91%) did so when also considering Cas12a-mediated deletions. We suggest that the use of MENdel helps researchers use MMEJ at scale for reverse genetics screenings and with sufficient intra-gene density rates to be viable for nearly all loss-of-function based gene editing therapeutic applications.


Subjects
Algorithms , Double-Stranded DNA Breaks , DNA End-Joining Repair , Frameshift Mutation , Gene Editing/methods , Genetic Therapy/methods , Genomics/methods , INDEL Mutation , Loss of Function Mutation , Reverse Genetics/methods , Animals , Bacterial Proteins/metabolism , Caspase 9/metabolism , Datasets as Topic , Embryonic Stem Cells/metabolism , Humans , Mice , ROC Curve , Streptococcus pyogenes/enzymology , Zebrafish/genetics
3.
bioRxiv ; 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38559275

ABSTRACT

Epitope tagging is an invaluable technique enabling the identification, tracking, and purification of proteins in vivo. We developed a tool, EpicTope, to facilitate this method by identifying amino acid positions suitable for epitope insertion. Our method uses a scoring function that considers multiple protein sequence and structural features to determine locations least disruptive to the protein's function. We validated our approach on the zebrafish Smad5 protein, showing that multiple predicted internally tagged Smad5 proteins rescue zebrafish smad5 mutant embryos, while the N- and C-terminal tagged variants do not, also as predicted. We further show that the internally tagged Smad5 proteins are accessible to antibodies in wholemount zebrafish embryo immunohistochemistry and by western blot. Our work demonstrates that EpicTope is an accessible and effective tool for designing epitope tag insertion sites. EpicTope is available under a GPL-3 license from: https://github.com/FriedbergLab/Epictope.

4.
Bioinform Adv ; 4(1): vbae043, 2024.
Article in English | MEDLINE | ID: mdl-38545087

ABSTRACT

We present CAFA-evaluator, a powerful Python program designed to evaluate the performance of prediction methods on targets with hierarchical concept dependencies. It generalizes multi-label evaluation to modern ontologies where the prediction targets are drawn from a directed acyclic graph and achieves high efficiency by leveraging matrix computation and topological sorting. The program requirements include a small number of standard Python libraries, making CAFA-evaluator easy to maintain. The code replicates the Critical Assessment of protein Function Annotation (CAFA) benchmarking, which evaluates predictions of the consistent subgraphs in Gene Ontology. Owing to its reliability and accuracy, the organizers have selected CAFA-evaluator as the official CAFA evaluation software. Availability and implementation: https://pypi.org/project/cafaeval.
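The idea of evaluating predictions over a directed acyclic graph can be illustrated with a minimal sketch. It is not CAFA-evaluator's code or API: the function names, the toy ontology, and the max-propagation rule are assumptions made for illustration, and plain dictionaries stand in for the matrix computation the abstract mentions. Terms are visited in topological order so that each parent term receives the highest score among its descendants before precision and recall are computed at a score threshold.

```python
from graphlib import TopologicalSorter


def propagate(scores, parents):
    """Push each term's score up to all of its ancestors (max rule).

    `parents` maps a term to its direct parents; both it and the max rule
    are illustrative assumptions, not CAFA-evaluator's actual interface.
    """
    # Build term -> direct children; TopologicalSorter treats the mapped
    # values as predecessors, so children are yielded before their parents.
    children = {}
    for term, direct_parents in parents.items():
        children.setdefault(term, set())
        for parent in direct_parents:
            children.setdefault(parent, set()).add(term)
    for term in scores:
        children.setdefault(term, set())

    propagated = {term: scores.get(term, 0.0) for term in children}
    for term in TopologicalSorter(children).static_order():
        for child in children[term]:
            propagated[term] = max(propagated[term], propagated[child])
    return propagated


def precision_recall(propagated, truth, tau):
    """Precision and recall of the terms scored at or above `tau`."""
    predicted = {term for term, score in propagated.items() if score >= tau}
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy ontology (hypothetical): b -> a -> root, c -> root.
toy_parents = {"root": [], "a": ["root"], "b": ["a"], "c": ["root"]}
toy_scores = {"b": 0.8, "c": 0.3}
toy_truth = {"root", "a"}  # already a consistent subgraph

propagated = propagate(toy_scores, toy_parents)
precision, recall = precision_recall(propagated, toy_truth, tau=0.5)
```

In this toy case the score of `b` is carried up to `a` and `root`, so three terms are predicted at the 0.5 threshold and two of them are correct. The root term is kept here for simplicity; a real evaluation may treat it differently.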

5.
Gigascience ; 11, 2022 04 15.
Article in English | MEDLINE | ID: mdl-35426911

ABSTRACT

BACKGROUND: Genome-wide gene function annotations are useful for hypothesis generation and for prioritizing candidate genes potentially responsible for phenotypes of interest. We functionally annotated the genes of 18 crop plant genomes across 14 species using the GOMAP pipeline. RESULTS: By comparison to existing GO annotation datasets, GOMAP-generated datasets cover more genes, contain more GO terms, and are similar in quality (based on precision and recall metrics using existing gold standards as the basis for comparison). From there, we sought to determine whether the datasets across multiple species could be used together to carry out comparative functional genomics analyses in plants. To test the idea and as a proof of concept, we created dendrograms of functional relatedness based on terms assigned for all 18 genomes. These dendrograms were compared to well-established species-level evolutionary phylogenies to determine whether trees derived were in agreement with known evolutionary relationships, which they largely are. Where discrepancies were observed, we determined branch support based on jackknifing then removed individual annotation sets by genome to identify the annotation sets causing unexpected relationships. CONCLUSIONS: GOMAP-derived functional annotations used together across multiple species generally retain sufficient biological signal to recover known phylogenetic relationships based on genome-wide functional similarities, indicating that comparative functional genomics across species based on GO data holds promise for generating novel hypotheses about comparative gene function and traits.


Subjects
Plant Genome , Genomics , Genetic Databases , Gene Ontology , Molecular Sequence Annotation , Phylogeny , Plants/genetics
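As a rough illustration of how functional relatedness between genomes might be quantified before building a dendrogram, the sketch below computes pairwise distances between sets of assigned GO terms. The abstract of record 5 does not state which distance measure was used, so the Jaccard distance here is an assumption, and the genome names and term identifiers are invented placeholders rather than data from the study.

```python
from itertools import combinations


def jaccard_distance(terms_a, terms_b):
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    union = terms_a | terms_b
    if not union:
        return 0.0
    return 1.0 - len(terms_a & terms_b) / len(union)


def distance_table(term_sets):
    """Pairwise distances keyed by alphabetically ordered genome pairs."""
    return {
        (first, second): jaccard_distance(term_sets[first], term_sets[second])
        for first, second in combinations(sorted(term_sets), 2)
    }


# Invented placeholder annotations, not taken from any real dataset.
toy_annotations = {
    "genome_A": {"GO:1", "GO:2", "GO:3"},
    "genome_B": {"GO:1", "GO:2", "GO:4"},
    "genome_C": {"GO:5", "GO:6"},
}

table = distance_table(toy_annotations)
closest_pair = min(table, key=table.get)
```

A table like this could be passed to any hierarchical clustering routine to draw a dendrogram. Recomputing it with one genome's annotation set left out is one simple way to see which set drives an unexpected grouping.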