Results 1 - 8 of 8
1.
PLoS Genet ; 19(9): e1010932, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37721944

ABSTRACT

The eQTL Catalogue is an open database of uniformly processed human molecular quantitative trait loci (QTLs). We are continuously updating the resource to further increase its utility for interpreting genetic associations with complex traits. Over the past two years, we have increased the number of uniformly processed studies from 21 to 31 and added X chromosome QTLs for 19 compatible studies. We have also implemented Leafcutter to directly identify splice-junction usage QTLs in all RNA sequencing datasets. Finally, to improve the interpretability of transcript-level QTLs, we have developed static QTL coverage plots that visualise the association between the genotype and average RNA sequencing read coverage in the region for all 1.7 million fine mapped associations. To illustrate the utility of these updates to the eQTL Catalogue, we performed colocalisation analysis between vitamin D levels in the UK Biobank and all molecular QTLs in the eQTL Catalogue. Although most GWAS loci colocalised both with eQTLs and transcript-level QTLs, we found that visual inspection could sometimes be used to distinguish primary splicing QTLs from those that appear to be secondary consequences of large-effect gene expression QTLs. While these visually confirmed primary splicing QTLs explain just 6/53 of the colocalising signals, they are significantly less pleiotropic than eQTLs and identify a prioritised causal gene in 4/6 cases.


Subjects
Multifactorial Inheritance, Quantitative Trait Loci, Humans, Quantitative Trait Loci/genetics, Genotype, Base Sequence, Genome-Wide Association Study, Single Nucleotide Polymorphism
2.
Nucleic Acids Res ; 51(W1): W207-W212, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37144459

ABSTRACT

g:Profiler is a reliable and up-to-date functional enrichment analysis tool that supports various evidence types, identifier types and organisms. The toolset integrates many databases, including Gene Ontology, KEGG and TRANSFAC, to provide a comprehensive and in-depth analysis of gene lists. It also provides interactive and intuitive user interfaces and supports ordered queries and custom statistical backgrounds, among other settings. g:Profiler provides multiple programmatic interfaces to access its functionality. These can be easily integrated into custom workflows and external tools, making them valuable resources for researchers who want to develop their own solutions. g:Profiler has been available since 2007 and is used to analyse millions of queries. Research reproducibility and transparency are achieved by maintaining working versions of all past database releases since 2015. g:Profiler supports 849 species, including vertebrates, plants, fungi, insects and parasites, and can analyse any organism through user-uploaded custom annotation files. In this update article, we introduce a novel filtering method highlighting Gene Ontology driver terms, accompanied by new graph visualizations providing a broader context for significant Gene Ontology terms. As a leading enrichment analysis and gene list interoperability service, g:Profiler offers a valuable resource for genetics, biology and medical researchers. It is freely accessible at https://biit.cs.ut.ee/gprofiler.
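
As an illustration of what a functional enrichment test computes, the sketch below evaluates a one-sided hypergeometric p-value for the overlap between a query gene list and one annotation term. The abstract does not state which statistic g:Profiler applies, so the choice of test, the function name and the example numbers are assumptions made for illustration only, and no multiple-testing correction is included.

```python
from math import comb


def hypergeom_enrichment_p(background_size, term_size, query_size, overlap):
    """Return P(X >= overlap) for X ~ Hypergeometric(N, K, n).

    background_size (N): genes in the statistical background
    term_size (K): background genes annotated to the term
    query_size (n): genes in the query list
    overlap: query genes annotated to the term
    All names are hypothetical; this is not g:Profiler's own code.
    """
    max_overlap = min(term_size, query_size)
    total = comb(background_size, query_size)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Made-up numbers: 20,000 background genes, a 100-gene term,
    # a 50-gene query, 5 of which carry the term.
    print(hypergeom_enrichment_p(20000, 100, 50, 5))
```

Changing `background_size` changes the p-value, which is one reason a custom statistical background of the kind mentioned in the abstract matters.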


Subjects
Chromosome Mapping, Computational Biology, Genes, Software, Animals, Chromosome Mapping/instrumentation, Chromosome Mapping/methods, Genetic Databases, Internet, Reproducibility of Results, User-Computer Interface, Computational Biology/instrumentation, Computational Biology/methods, Genes/genetics, Humans
3.
Nucleic Acids Res ; 47(W1): W191-W198, 2019 07 02.
Article in English | MEDLINE | ID: mdl-31066453

ABSTRACT

Biological data analysis often deals with lists of genes arising from various studies. The g:Profiler toolset is widely used for finding biological categories enriched in gene lists, conversions between gene identifiers and mappings to their orthologs. The mission of g:Profiler is to provide a reliable service based on up-to-date high quality data in a convenient manner across many evidence types, identifier spaces and organisms. g:Profiler relies on Ensembl as a primary data source and follows their quarterly release cycle while updating the other data sources simultaneously. The current update provides a better user experience due to a modern responsive web interface, standardised API and libraries. The results are delivered through an interactive and configurable web design. Results can be downloaded as publication ready visualisations or delimited text files. In the current update we have extended the support to 467 species and strains, including vertebrates, plants, fungi, insects and parasites. By supporting user uploaded custom GMT files, g:Profiler is now capable of analysing data from any organism. All past releases are maintained for reproducibility and transparency. The 2019 update introduces an extensive technical rewrite making the services faster and more flexible. g:Profiler is freely available at https://biit.cs.ut.ee/gprofiler.
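
The custom GMT files mentioned above are plain tab-delimited text, conventionally with one gene set per line: a set identifier, a description, then the member genes. A minimal parsing sketch under that assumption follows; the function name and the example identifiers are hypothetical, and g:Profiler's own upload validation may differ.

```python
def parse_gmt(lines):
    """Parse GMT-formatted lines into {set_id: {"description", "genes"}}.

    Assumes the conventional layout: identifier, description, then genes,
    all separated by tabs. Blank lines are skipped.
    """
    gene_sets = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"expected at least 2 tab-separated fields: {line!r}")
        set_id, description, *genes = fields
        # Drop empty cells left behind by trailing tabs.
        gene_sets[set_id] = {
            "description": description,
            "genes": [g for g in genes if g],
        }
    return gene_sets


if __name__ == "__main__":
    example = ["SET1\tdemo gene set\tGENE_A\tGENE_B\n"]
    print(parse_gmt(example))
```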


Subjects
Genetic Databases, Genome, Information Storage and Retrieval, Software, Animals, Fungi/genetics, Humans, Parasites/genetics, Plants/genetics
4.
PLoS One ; 19(5): e0303176, 2024.
Article in English | MEDLINE | ID: mdl-38728305

ABSTRACT

BACKGROUND: The COVID-19 pandemic was characterised by rapid waves of disease, carried by the emergence of new and more infectious SARS-CoV-2 virus variants. How the pandemic unfolded in various locations during its first two years has yet to be sufficiently covered. To this end, here we are looking at the circulating SARS-CoV-2 variants, their diversity, and hospitalisation rates in Estonia in the period from March 2020 to March 2022. METHODS: We sequenced a total of 27,550 SARS-CoV-2 samples in Estonia between March 2020 and March 2022. High-quality sequences were genotyped and assigned to Nextstrain clades and Pango lineages. We used regression analysis to determine the dynamics of lineage diversity and the probability of clade-specific hospitalisation stratified by age and sex. RESULTS: We successfully sequenced a total of 25,375 SARS-CoV-2 genomes (or 92%), identifying 19 Nextstrain clades and 199 Pango lineages. In 2020 the most prevalent clades were 20B and 20A. The various subsequent waves of infection were driven by 20I (Alpha), 21J (Delta) and Omicron clades 21K and 21L. Lineage diversity via the Shannon index was at its highest during the Delta wave. About 3% of sequenced SARS-CoV-2 samples came from hospitalised individuals. Hospitalisation increased markedly with age in the over-forties, and was negligible in the under-forties. Vaccination decreased the odds of hospitalisation in over-forties. The effect of vaccination on hospitalisation rates was strongly dependent upon age but was clade-independent. People who were infected with Omicron clades had a lower hospitalisation likelihood in age groups of forty and over than was the case with pre-Omicron clades regardless of vaccination status. CONCLUSIONS: COVID-19 disease waves in Estonia were driven by the Alpha, Delta, and Omicron clades. Omicron clades were associated with a substantially lower hospitalisation probability than pre-Omicron clades. The protective effect of vaccination in reducing hospitalisation likelihood was independent of the involved clade.
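
The Shannon index used above to measure lineage diversity is H = -Σ p_i ln p_i, where p_i is the share of samples assigned to lineage i. A minimal sketch follows, assuming the natural logarithm and a flat list of per-sample lineage labels; the abstract does not say how the authors binned samples over time or which log base they used, and the lineage labels in the example are illustrative.

```python
import math
from collections import Counter


def shannon_index(lineages):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over observed lineages.

    `lineages` is an iterable with one lineage label per sequenced sample.
    """
    counts = Counter(lineages)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in counts.values())


if __name__ == "__main__":
    # Illustrative labels only, not data from the study.
    one_lineage = ["B.1.1.7"] * 8
    mixed = ["AY.4"] * 4 + ["AY.122"] * 4
    print(shannon_index(one_lineage), shannon_index(mixed))
```

A period dominated by a single lineage gives a value near zero, while an even mix of many lineages gives a higher one, so the index can be tracked per week or per wave.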


Subjects
COVID-19, Hospitalization, SARS-CoV-2, Humans, COVID-19/epidemiology, COVID-19/virology, Hospitalization/statistics & numerical data, SARS-CoV-2/genetics, SARS-CoV-2/isolation & purification, SARS-CoV-2/classification, Male, Female, Middle Aged, Adult, Aged, Estonia/epidemiology, Viral Genome, Young Adult, Phylogeny, Pandemics, Adolescent, Child, Infant, Preschool Child, Aged 80 and over
5.
Nat Biotechnol ; 41(10): 1446-1456, 2023 Oct.
Article in English | MEDLINE | ID: mdl-36797492

ABSTRACT

Most short sequences can be precisely written into a selected genomic target using prime editing; however, it remains unclear what factors govern insertion. We design a library of 3,604 sequences of various lengths and measure the frequency of their insertion into four genomic sites in three human cell lines, using different prime editor systems in varying DNA repair contexts. We find that length, nucleotide composition and secondary structure of the insertion sequence all affect insertion rates. We also discover that the 3' flap nucleases TREX1 and TREX2 suppress the insertion of longer sequences. Combining the sequence and repair features into a machine learning model, we can predict relative frequency of insertions into a site with R = 0.70. Finally, we demonstrate how our accurate prediction and user-friendly software help choose codon variants of common fusion tags that insert at high efficiency, and provide a catalog of empirically determined insertion rates for over a hundred useful sequences.


Subjects
DNA Repair, DNA Transposable Elements, Humans, DNA Repair/genetics, Gene Editing, CRISPR-Cas Systems
6.
bioRxiv ; 2023 Jun 28.
Article in English | MEDLINE | ID: mdl-37425722

ABSTRACT

The genome engineering capability of the CRISPR/Cas system depends on the DNA repair machinery to generate the final outcome. Several genes can have an impact on mutations created, but their exact function and contribution to the result of the repair are not completely characterised. This lack of knowledge has limited the ability to comprehend and regulate the editing outcomes. Here, we measure how the absence of 21 repair genes changes the mutation outcomes of Cas9-generated cuts at 2,812 synthetic target sequences in mouse embryonic stem cells. Absence of key non-homologous end joining genes Lig4, Xrcc4, and Xlf abolished small insertions and deletions, while disabling key microhomology-mediated repair genes Nbn and Polq reduced frequency of longer deletions. Complex alleles of combined insertion and deletions were preferentially generated in the absence of Xrcc6. We further discover finer structure in the outcome frequency changes for single nucleotide insertions and deletions between large microhomologies that are differentially modulated by the knockouts. We use the knowledge of the reproducible variation across repair milieus to build predictive models of Cas9 editing results that outperform the current standards. This work improves our understanding of DNA repair gene function, and provides avenues for more precise modulation of CRISPR/Cas9-generated mutations.

7.
bioRxiv ; 2023 Apr 07.
Article in English | MEDLINE | ID: mdl-37066341

ABSTRACT

Splicing quantitative trait loci (QTLs) have been implicated as a common mechanism underlying complex trait associations. However, utilising splicing QTLs in target discovery and prioritisation has been challenging due to extensive data normalisation which often renders the direction of the genetic effect as well as its magnitude difficult to interpret. This is further complicated by the fact that strong expression QTLs often manifest as weak splicing QTLs and vice versa, making it difficult to uniquely identify the underlying molecular mechanism at each locus. We find that these ambiguities can be mitigated by visualising the association between the genotype and average RNA sequencing read coverage in the region. Here, we generate these QTL coverage plots for 1.7 million molecular QTL associations in the eQTL Catalogue identified with five quantification methods. We illustrate the utility of these QTL coverage plots by performing colocalisation between vitamin D levels in the UK Biobank and all molecular QTLs in the eQTL Catalogue. We find that while visually confirmed splicing QTLs explain just 6/53 of the colocalising signals, they are significantly less pleiotropic than eQTLs and identify a prioritised causal gene in 4/6 cases. All our association summary statistics and QTL coverage plots are freely available at https://www.ebi.ac.uk/eqtl/.
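
The core quantity behind a QTL coverage plot, as described above, is the average read coverage across a region for each genotype group. The sketch below computes those per-genotype mean profiles from already-extracted depth vectors. The data layout, the function name and the coding of genotypes as alternative-allele counts (0, 1, 2) are assumptions for illustration; any normalisation the authors apply before averaging is not reproduced here.

```python
from collections import defaultdict


def mean_coverage_by_genotype(coverage, genotypes):
    """Average per-position read depth within each genotype group.

    coverage: {sample_id: [depth at each position in the region]}
    genotypes: {sample_id: genotype label, e.g. alt-allele count 0/1/2}
    Returns {genotype: [mean depth at each position]}.
    """
    sums = {}
    counts = defaultdict(int)
    n_positions = None
    for sample, depth in coverage.items():
        if n_positions is None:
            n_positions = len(depth)
        elif len(depth) != n_positions:
            raise ValueError("all samples must cover the same positions")
        genotype = genotypes[sample]
        if genotype not in sums:
            sums[genotype] = [0.0] * n_positions
        for i, value in enumerate(depth):
            sums[genotype][i] += value
        counts[genotype] += 1
    return {g: [total / counts[g] for total in sums[g]] for g in sums}


if __name__ == "__main__":
    # Toy depths for three samples over a two-position region.
    coverage = {"s1": [10, 20], "s2": [30, 40], "s3": [5, 5]}
    genotypes = {"s1": 0, "s2": 0, "s3": 1}
    print(mean_coverage_by_genotype(coverage, genotypes))
```

Plotting one line per genotype from the returned profiles shows whether the variant shifts coverage across the whole gene or only at specific exons or junctions.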

8.
F1000Res ; 9, 2020.
Article in English | MEDLINE | ID: mdl-33564394

ABSTRACT

g:Profiler (https://biit.cs.ut.ee/gprofiler) is a widely used gene list functional profiling and namespace conversion toolset that has been contributing to reproducible biological data analysis since 2007. Here we introduce the accompanying R package, gprofiler2, developed to facilitate programmatic access to g:Profiler computations and databases via REST API. The gprofiler2 package provides easy-to-use functionality that enables researchers to incorporate functional enrichment analysis into automated analysis pipelines written in R. The package also implements interactive visualisation methods to help interpret the enrichment results and to illustrate them for publications. In addition, gprofiler2 gives access to the versatile gene/protein identifier conversion functionality in g:Profiler, making it possible to map between hundreds of different identifier types or orthologous species. The gprofiler2 package is freely available at the CRAN repository.


Subjects
Computational Biology, Gene Expression Profiling, Software