1.
GigaByte ; 2023: 1-10, 2023.
Article in English | MEDLINE | ID: mdl-37732134

ABSTRACT

We present ensemblQueryR, an R package for querying Ensembl linkage disequilibrium (LD) endpoints. This package is flexible, fast and user-friendly, and optimised for high-throughput querying. Its functions are intuitive and amenable to integration into custom code, take and return familiar R object types, and offer parallelisation. For each Ensembl LD endpoint, ensemblQueryR provides two functions, permitting both single- and multi-query modes of operation. The multi-query functions are optimised for large query sizes and provide optional parallelisation to leverage available computational resources and minimise processing time. We demonstrate improved computational performance of ensemblQueryR over an existing tool in terms of random access memory (RAM) usage and speed, delivering a 10-fold speed increase whilst using a third of the RAM. Finally, ensemblQueryR is near-agnostic to operating system and computational architecture through Docker and Singularity images, making this tool widely accessible to the scientific community.
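The single- and multi-query pattern described in this abstract can be sketched in Python. This is not ensemblQueryR's API (the package is written in R, and its function names are not given here); `ld_window_url`, `fetch_json` and `query_many` are hypothetical names, and the URL layout assumes the Ensembl REST endpoint `ld/:species/:id/:population_name`.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable

ENSEMBL_REST = "https://rest.ensembl.org"  # public Ensembl REST server


def ld_window_url(rsid: str, population: str, species: str = "human") -> str:
    """Build the URL for one LD query around a single variant (single-query mode)."""
    return f"{ENSEMBL_REST}/ld/{species}/{rsid}/{population}?content-type=application/json"


def fetch_json(url: str):
    """Download and decode one JSON response (performs a real network call)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def query_many(
    rsids: Iterable[str],
    population: str,
    fetch: Callable[[str], object] = fetch_json,
    workers: int = 4,
) -> Dict[str, object]:
    """Multi-query mode: run one LD query per variant on a thread pool.

    `fetch` is injectable so the function can be exercised without network access.
    """
    rsids = list(rsids)
    urls = [ld_window_url(rsid, population) for rsid in rsids]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fetch, urls))  # map preserves input order
    return dict(zip(rsids, results))
```

Threads suit this workload because each query spends most of its time waiting on the network; a real client would also need to respect the server's rate limits, which this sketch does not handle.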

2.
GigaByte ; 2023: gigabyte87, 2023.
Article in English | MEDLINE | ID: mdl-37637773

ABSTRACT

Amazon Simple Storage Service (Amazon S3) is a widely used platform for storing large biomedical datasets. Unintended data alterations can occur during data writing and transmission, altering the original content and generating unexpected results. However, no open-source and easy-to-use tool exists to verify end-to-end data integrity. Here, we present aws-s3-integrity-check, a user-friendly, lightweight, and reliable bash tool to verify the integrity of a dataset stored in an Amazon S3 bucket. Using this tool, we only needed ∼114 min to verify the integrity of 1,045 records ranging between 5 bytes and 10 gigabytes and occupying ∼935 gigabytes of the Amazon S3 cloud. Our aws-s3-integrity-check tool also provides file-by-file on-screen and log-file-based information about the status of each integrity check. To our knowledge, this tool is the only open-source one that allows verifying the integrity of a dataset uploaded to the Amazon S3 Storage quickly, reliably, and efficiently. The tool is freely available for download and use at https://github.com/SoniaRuiz/aws-s3-integrity-check and https://hub.docker.com/r/soniaruiz/aws-s3-integrity-check.
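The abstract does not describe how the tool performs its check, and the tool itself is written in bash. As a rough illustration of one common approach, the Python sketch below recomputes locally the ETag value that S3 is widely observed to report and compares it with the remote one. The helper names and the 8 MiB default part size are assumptions; the ETag format holds only for objects that are not encrypted with SSE-KMS or SSE-C, and the part size must match the one used at upload.

```python
import hashlib


def s3_style_etag(data: bytes, part_size: int = 8 * 1024 * 1024) -> str:
    """Recompute an S3-style ETag for `data` (hypothetical helper).

    Assumption: objects no larger than `part_size` were uploaded in a single
    request, so the ETag is the plain MD5 hex digest. Larger objects were
    uploaded in parts, so the ETag is the MD5 of the concatenated binary
    part digests, followed by "-<number of parts>".
    """
    if len(data) <= part_size:
        return hashlib.md5(data).hexdigest()
    part_digests = [
        hashlib.md5(data[offset:offset + part_size]).digest()
        for offset in range(0, len(data), part_size)
    ]
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"


def matches_remote(data: bytes, remote_etag: str, part_size: int = 8 * 1024 * 1024) -> bool:
    """Compare a local recomputation with the ETag reported by S3."""
    # S3 returns the ETag wrapped in double quotes; strip them before comparing.
    return s3_style_etag(data, part_size) == remote_etag.strip('"')
```

For files too large to hold in memory, the same digests can be computed by reading the file one part at a time.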

3.
Brain ; 146(7): 2869-2884, 2023 07 03.
Article in English | MEDLINE | ID: mdl-36624280

ABSTRACT

Improvements in functional genomic annotation have led to a critical mass of neurogenetic discoveries. This is exemplified in hereditary ataxia, a heterogeneous group of disorders characterised by incoordination from cerebellar dysfunction. Associated pathogenic variants in more than 300 genes have been described, leading to a detailed genetic classification partitioned by age-of-onset. Despite these advances, up to 75% of patients with ataxia remain molecularly undiagnosed even following whole genome sequencing, as exemplified in the 100 000 Genomes Project. This study aimed to understand whether we can improve our knowledge of the genetic architecture of hereditary ataxia by leveraging functional genomic annotations, and as a result, generate insights and strategies that raise the diagnostic yield. To achieve these aims, we used publicly-available multi-omics data to generate 294 genic features, capturing information relating to a gene's structure, genetic variation, tissue-specific, cell-type-specific and temporal expression, as well as protein products of a gene. We studied these features across genes typically causing childhood-onset, adult-onset or both types of disease first individually, then collectively. This led to the generation of testable hypotheses which we investigated using whole genome sequencing data from up to 2182 individuals presenting with ataxia and 6658 non-neurological probands recruited in the 100 000 Genomes Project. Using this approach, we demonstrated a high short tandem repeat (STR) density within childhood-onset genes suggesting that we may be missing pathogenic repeat expansions within this cohort. This was verified in both childhood- and adult-onset ataxia patients from the 100 000 Genomes Project who were unexpectedly found to have a trend for higher repeat sizes even at naturally-occurring STRs within known ataxia genes, implying a role for STRs in pathogenesis. 
Using unsupervised analysis, we found significant similarities in genomic annotation across the gene panels, which suggested adult- and childhood-onset patients should be screened using a common diagnostic gene set. We tested this within the 100 000 Genomes Project by assessing the burden of pathogenic variants among childhood-onset genes in adult-onset patients and vice versa. This demonstrated a significantly higher burden of rare, potentially pathogenic variants in conventional childhood-onset genes among individuals with adult-onset ataxia. Our analysis has implications for the current clinical practice in genetic testing for hereditary ataxia. We suggest that the diagnostic rate for hereditary ataxia could be increased by removing the age-of-onset partition, and through a modified screening for repeat expansions in naturally-occurring STRs within known ataxia-associated genes, in effect treating these regions as candidate pathogenic loci.


Subjects
Cerebellar Ataxia , Spinocerebellar Degenerations , Adult , Humans , Spinocerebellar Degenerations/genetics , Cerebellar Ataxia/diagnosis , Cerebellar Ataxia/genetics , Ataxia/diagnosis , Ataxia/genetics , Genomics , Genetic Testing
4.
Nucleic Acids Res ; 51(D1): D167-D178, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36399497

ABSTRACT

Dysregulation of RNA splicing contributes to both rare and complex diseases. RNA-sequencing data from human tissues has shown that this process can be inaccurate, resulting in the presence of novel introns detected at low frequency across samples and within an individual. To enable the full spectrum of intron use to be explored, we have developed IntroVerse, which offers an extensive catalogue on the splicing of 332,571 annotated introns and a linked set of 4,679,474 novel junctions covering 32,669 different genes. This dataset has been generated through the analysis of 17,510 human control RNA samples from 54 tissues provided by the Genotype-Tissue Expression Consortium. IntroVerse has two unique features: (i) it provides a complete catalogue of novel junctions and (ii) each novel junction has been assigned to a specific annotated intron. This unique, hierarchical structure offers multiple uses, including the identification of novel transcripts from known genes and their tissue-specific usage, and the assessment of background splicing noise for introns thought to be mis-spliced in disease states. IntroVerse provides a user-friendly web interface and is freely available at https://rytenlab.com/browser/app/introverse.


Subjects
Databases, Genetic , Introns , RNA Splicing , Humans , Alternative Splicing , Base Sequence , Introns/genetics , RNA , RNA Splicing/genetics
5.
Commun Biol ; 4(1): 1262, 2021 11 04.
Article in English | MEDLINE | ID: mdl-34737414

ABSTRACT

Mitochondrial dysfunction contributes to the pathogenesis of many neurodegenerative diseases. The mitochondrial genome encodes core respiratory chain proteins, but the vast majority of mitochondrial proteins are nuclear-encoded, making interactions between the two genomes vital for cell function. Here, we examine these relationships by comparing mitochondrial and nuclear gene expression across different regions of the human brain in healthy and disease cohorts. We find strong regional patterns that are modulated by cell type and reflect functional specialisation. Nuclear genes causally implicated in sporadic Parkinson's and Alzheimer's disease (AD) show much stronger relationships with the mitochondrial genome than expected by chance, and mitochondrial-nuclear relationships are highly perturbed in AD cases, particularly through synaptic and lysosomal pathways, potentially implicating the regulation of energy balance and removal of dysfunctional mitochondria in the etiology or progression of the disease. Finally, we present MitoNuclearCOEXPlorer, a tool to interrogate key mitochondria-nuclear relationships in multi-dimensional brain data.


Subjects
Brain/physiopathology , Cell Nucleus/physiology , Mitochondria/physiology , Neurodegenerative Diseases/physiopathology , Humans , Sequence Analysis, RNA , Signal Transduction
6.
Nat Commun ; 12(1): 2076, 2021 04 06.
Article in English | MEDLINE | ID: mdl-33824317

ABSTRACT

Knowledge of genomic features specific to the human lineage may provide insights into brain-related diseases. We leverage high-depth whole genome sequencing data to generate a combined annotation identifying regions simultaneously depleted for genetic variation (constrained regions) and poorly conserved across primates. We propose that these constrained, non-conserved regions (CNCRs) have been subject to human-specific purifying selection and are enriched for brain-specific elements. We find that CNCRs are depleted from protein-coding genes but enriched within lncRNAs. We demonstrate that per-SNP heritability of a range of brain-relevant phenotypes is enriched within CNCRs. We find that genes implicated in neurological diseases have high CNCR density, including APOE, highlighting an unannotated intron-3 retention event. Using human brain RNA-sequencing data, we show the intron-3-retaining transcript to be more abundant in Alzheimer's disease with more severe tau and amyloid pathological burden. Thus, we demonstrate a potential association of human-lineage-specific sequences with brain development and neurological disease.


Subjects
Apolipoproteins E/genetics , Genome, Human , Neurodegenerative Diseases/genetics , Phylogeny , Alzheimer Disease/genetics , Alzheimer Disease/pathology , Brain/pathology , Chromosomes, Human, Pair 19/genetics , Conserved Sequence/genetics , DNA, Intergenic/genetics , Gene Ontology , Humans , Introns/genetics , Linkage Disequilibrium/genetics , Molecular Sequence Annotation , Phenotype , Polymorphism, Single Nucleotide/genetics , RNA, Long Noncoding/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Regression Analysis