1.
Genome Res ; 32(4): 738-749, 2022 04.
Article in English | MEDLINE | ID: mdl-35256454

ABSTRACT

The Human Reference Genome serves as the foundation for modern genomic analyses. However, in its present form, it does not adequately represent the vast genetic diversity of the human population. In this study, we explored the consensus genome as a potential successor of the current reference genome and assessed its effect on the accuracy of RNA-seq read alignment. To find the best haploid genome representation, we constructed consensus genomes at the pan-human, superpopulation, and population levels, using variant information from The 1000 Genomes Project Consortium. Using personal haploid genomes as the ground truth, we compared mapping errors for real RNA-seq reads aligned to the consensus genomes versus the reference genome. For reads overlapping homozygous variants, we found that the mapping error decreased by a factor of approximately two to three when the reference was replaced with the pan-human consensus genome. We also found that using more population-specific consensuses resulted in little to no increase over using the pan-human consensus, suggesting a limit in the utility of incorporating a more specific genomic variation. Replacing the reference with consensus genomes impacts functional analyses, such as differential expressions of isoforms, genes, and splice junctions.
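
As a rough illustration of how a haploid consensus sequence can be derived from variant data, the sketch below swaps in the alternate allele wherever its frequency exceeds 0.5. The function name `build_consensus`, the tuple layout, and the 0.5 threshold are assumptions made for this example rather than details stated in the abstract, and indels are skipped.

```python
def build_consensus(reference, variants):
    """Return a haploid consensus sequence from a reference and a variant list.

    Each variant is (0-based position, ref allele, alt allele, alt allele frequency).
    Hypothetical layout for illustration; only single-base substitutions are handled.
    """
    seq = list(reference)
    for pos, ref, alt, alt_freq in variants:
        if len(ref) != 1 or len(alt) != 1:
            continue  # indels would shift coordinates; skipped in this sketch
        if seq[pos] != ref:
            raise ValueError(f"reference mismatch at position {pos}")
        if alt_freq > 0.5:
            seq[pos] = alt  # the alternate allele is the major allele here
    return "".join(seq)


reference = "ACGTACGT"
variants = [
    (1, "C", "T", 0.82),  # major alternate allele: replaced
    (4, "A", "G", 0.10),  # minor alternate allele: reference base kept
]
consensus = build_consensus(reference, variants)
print(consensus)  # ATGTACGT
```

Passing population-specific allele frequencies instead of pooled ones would give the population-level variant of the same idea.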


Subject(s)
Genome, Human , Genomics , Consensus , Genomics/methods , Humans , RNA-Seq , Exome Sequencing
2.
Chem Senses ; 48, 2023 01 01.
Article in English | MEDLINE | ID: mdl-37539767

ABSTRACT

The sweet taste receptor (STR) is a G protein-coupled receptor (GPCR) responsible for mediating cellular responses to sweet stimuli. Early evidence suggests that elements of the STR signaling system are present beyond the tongue in metabolically active tissues, where it may act as an extraoral glucose sensor. This study aimed to delineate expression of the STR in extraoral tissues using publicly available RNA-sequencing repositories. Gene expression data was mined for all genes implicated in the structure and function of the STR, and control genes including highly expressed metabolic genes in relevant tissues, other GPCRs and effector G proteins with physiological roles in metabolism, and other GPCRs with expression exclusively outside the metabolic tissues. Since the physiological role of the STR in extraoral tissues is likely related to glucose sensing, expression was then examined in diseases related to glucose-sensing impairment such as type 2 diabetes. An aggregate co-expression network was then generated to precisely determine co-expression patterns among the STR genes in these tissues. We found that STR gene expression was negligible in human pancreatic and adipose tissues, and low in intestinal tissue. Genes encoding the STR did not show significant co-expression or connectivity with other functional genes in these tissues. In addition, STR expression was higher in mouse pancreatic and adipose tissues, and equivalent to human in intestinal tissue. Our results suggest that STR expression in mice is not representative of expression in humans, and the receptor is unlikely to be a promising extraoral target in human cardiometabolic disease.


Subject(s)
Cardiovascular Diseases , Diabetes Mellitus, Type 2 , Taste Buds , Mice , Humans , Animals , Taste/physiology , Diabetes Mellitus, Type 2/genetics , Taste Buds/metabolism , Receptors, G-Protein-Coupled/metabolism , Gene Expression Profiling , Glucose/metabolism , Cardiovascular Diseases/metabolism
3.
Mol Cell Proteomics ; 19(11): 1876-1895, 2020 11.
Article in English | MEDLINE | ID: mdl-32817346

ABSTRACT

Co-fractionation MS (CF-MS) is a technique with potential to characterize endogenous and unmanipulated protein complexes on an unprecedented scale. However, this potential has been offset by a lack of guidelines for best-practice CF-MS data collection and analysis. To obtain such guidelines, this study thoroughly evaluates novel and published Saccharomyces cerevisiae CF-MS data sets using very high proteome coverage libraries of yeast gold standard complexes. A new method for identifying gold standard complexes in CF-MS data, Reference Complex Profiling, and the Extending 'Guilt-by-Association' by Degree (EGAD) R package are used for these evaluations, which are verified with concurrent analyses of published human data. By evaluating data collection designs, which involve fractionation of cell lysates, it is found that near-maximum recall of complexes can be achieved with fewer samples than published studies. Distributing sample collection across orthogonal fractionation methods, rather than a single high resolution data set, leads to particularly efficient recall. By evaluating 17 different similarity scoring metrics, which are central to CF-MS data analysis, it is found that two metrics rarely used in past CF-MS studies, Spearman and Kendall correlations, and the recently introduced Co-apex metric frequently maximize recall, whereas a popular metric, Euclidean distance, delivers poor recall. The common practice of integrating external genomic data into CF-MS data analysis is also evaluated, revealing that this practice may improve the precision and recall of known complexes but is generally unsuitable for predicting novel complexes in model organisms. If studying nonmodel organisms using orthologous genomic data, it is found that particular subsets of fractionation profiles (e.g. the lowest abundance quartile) should be excluded to minimize false discovery. These assessments are summarized in a series of universally applicable guidelines for precise, sensitive and efficient CF-MS studies of known complexes, and effective predictions of novel complexes for orthogonal experimental validation.
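
To make the contrast between similarity metrics concrete, the sketch below scores toy elution profiles with Spearman correlation, Euclidean distance, and a simplified co-apex check. The profiles are invented, the helper names are hypothetical, and treating Co-apex as "both profiles peak in the same fraction" is an assumption about the metric, not its published definition.

```python
from math import sqrt


def ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def euclidean(x, y):
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def co_apex(x, y):
    """Simplified co-apex: do both profiles peak in the same fraction?"""
    return x.index(max(x)) == y.index(max(y))


# Toy elution profiles across seven fractions (invented values).
a = [0, 1, 5, 20, 6, 1, 0]        # protein A
b = [0, 10, 50, 200, 60, 10, 0]   # same shape as A, ten times more abundant
c = [18, 3, 4, 1, 5, 2, 1]        # different shape, abundance similar to A

print(spearman(a, b), spearman(a, c))
print(euclidean(a, b), euclidean(a, c))
print(co_apex(a, b), co_apex(a, c))
```

In this toy case the rank-based score groups A with B, which co-elute, while Euclidean distance places A closer to C simply because their abundances are similar.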


Subject(s)
Chemical Fractionation/methods , Mass Spectrometry/methods , Proteome/metabolism , Proteomics/methods , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Chromatography, Gel , Chromatography, Liquid/methods , Gene Ontology , Humans , Reference Standards
4.
Nucleic Acids Res ; 48(W1): W566-W571, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32392296

ABSTRACT

Co-expression analysis has provided insight into gene function in organisms from Arabidopsis to zebrafish. Comparison across species has the potential to enrich these results, for example by prioritizing among candidate human disease genes based on their network properties or by finding alternative model systems where their co-expression is conserved. Here, we present CoCoCoNet as a tool for identifying conserved gene modules and comparing co-expression networks. CoCoCoNet is a resource for both data and methods, providing gold standard networks and sophisticated tools for on-the-fly comparative analyses across 14 species. We show how CoCoCoNet can be used in two use cases. In the first, we demonstrate deep conservation of a nucleolus gene module across very divergent organisms, and in the second, we show how the heterogeneity of autism mechanisms in humans can be broken down by functional groups and translated to model organisms. CoCoCoNet is free to use and available to all at https://milton.cshl.edu/CoCoCoNet, with data and R scripts available at ftp://milton.cshl.edu/data.


Subject(s)
Gene Regulatory Networks , Software , Animals , Autism Spectrum Disorder/genetics , Gene Expression , Humans , RNA-Seq , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism
5.
Proc Natl Acad Sci U S A ; 116(13): 6491-6500, 2019 03 26.
Article in English | MEDLINE | ID: mdl-30846554

ABSTRACT

Differential expression (DE) is commonly used to explore molecular mechanisms of biological conditions. While many studies report significant results between their groups of interest, the degree to which results are specific to the question at hand is not generally assessed, potentially leading to inaccurate interpretation. This could be particularly problematic for metaanalysis where replicability across datasets is taken as strong evidence for the existence of a specific, biologically relevant signal, but which instead may arise from recurrence of generic processes. To address this, we developed an approach to predict DE based on an analysis of over 600 studies. A predictor based on empirical prior probability of DE performs very well at this task (mean area under the receiver operating characteristic curve, ∼0.8), indicating that a large fraction of DE hit lists are nonspecific. In contrast, predictors based on attributes such as gene function, mutation rates, or network features perform poorly. Genes associated with sex, the extracellular matrix, the immune system, and stress responses are prominent within the "DE prior." In a series of control studies, we show that these patterns reflect shared biology rather than technical artifacts or ascertainment biases. Finally, we demonstrate the application of the DE prior to data interpretation in three use cases: (i) breast cancer subtyping, (ii) single-cell genomics of pancreatic islet cells, and (iii) metaanalysis of lung adenocarcinoma and renal transplant rejection transcriptomics. In all cases, we find hallmarks of generic DE, highlighting the need for nuanced interpretation of gene phenotypic associations.
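
One simple way to picture an empirical prior of this kind is the fraction of earlier studies in which each gene appeared as a DE hit, used as a score to predict the hit list of a new study. The sketch below does that on invented gene labels; `de_prior` and `auroc` are hypothetical helpers, and the study's actual predictor may be built differently.

```python
from collections import Counter


def de_prior(hit_lists):
    """Fraction of studies in which each gene was reported as differentially expressed."""
    counts = Counter(gene for hits in hit_lists for gene in set(hits))
    return {gene: n / len(hit_lists) for gene, n in counts.items()}


def auroc(scores, positives, genes):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [scores.get(g, 0.0) for g in genes if g in positives]
    neg = [scores.get(g, 0.0) for g in genes if g not in positives]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy hit lists from four earlier studies (gene labels are invented).
previous_studies = [
    {"geneA", "geneB", "geneC"},
    {"geneA", "geneB", "geneD"},
    {"geneA", "geneE"},
    {"geneB", "geneF"},
]
prior = de_prior(previous_studies)

# A new study's hit list, scored against every gene seen so far.
new_hits = {"geneA", "geneB", "geneG"}
all_genes = sorted(set().union(*previous_studies) | new_hits)
auc = auroc(prior, new_hits, all_genes)
print(prior["geneA"], auc)
```

An AUROC well above 0.5 here would mean the new hit list is largely predictable from how often genes are DE in general, which is the sense of "nonspecific" used in the abstract.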


Subject(s)
Gene Expression Profiling , Gene Expression Regulation , Human Genetics , Probability , Adenocarcinoma/genetics , Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Electronic Data Processing , Female , Gene Regulatory Networks , Genes, Essential , Genomics , Graft Rejection , Humans , Kidney Transplantation , Lung Neoplasms , ROC Curve , Recurrence , Sensitivity and Specificity , Transcriptome
6.
Nucleic Acids Res ; 46(10): 5125-5138, 2018 06 01.
Article in English | MEDLINE | ID: mdl-29718481

ABSTRACT

Many tools are available for RNA-seq alignment and expression quantification, with comparative value being hard to establish. Benchmarking assessments often highlight methods' good performance, but are focused on either model data or fail to explain variation in performance. This leaves us to ask, what is the most meaningful way to assess different alignment choices? And importantly, where is there room for progress? In this work, we explore the answers to these two questions by performing an exhaustive assessment of the STAR aligner. We assess STAR's performance across a range of alignment parameters using common metrics, and then on biologically focused tasks. We find technical metrics such as fraction mapping or expression profile correlation to be uninformative, capturing properties unlikely to have any role in biological discovery. Surprisingly, we find that changes in alignment parameters within a wide range have little impact on both technical and biological performance. Yet, when performance finally does break, it happens in difficult regions, such as X-Y paralogs and MHC genes. We believe improved reporting by developers will help establish where results are likely to be robust or fragile, providing a better baseline to establish where methodological progress can still occur.


Subject(s)
Gene Expression , Sequence Alignment/methods , Sequence Analysis, RNA/methods , Software , Algorithms , Chromosomes, Human, Y , Databases, Genetic , Female , Humans , Male , Sex Factors
7.
Nucleic Acids Res ; 45(4): e20, 2017 02 28.
Article in English | MEDLINE | ID: mdl-28204549

ABSTRACT

Gene set analysis, which translates gene lists into enriched functions, is among the most common bioinformatic methods. Yet few would advocate taking the results at face value. Not only is there no agreement on the algorithms themselves, there is no agreement on how to benchmark them. In this paper, we evaluate the robustness and uniqueness of enrichment results as a means of assessing methods even where correctness is unknown. We show that heavily annotated ('multifunctional') genes are likely to appear in genomics study results and drive the generation of biologically non-specific enrichment results as well as highly fragile significances. By providing a means of determining where enrichment analyses report non-specific and non-robust findings, we are able to assess where we can be confident in their use. We find significant progress in recent bias correction methods for enrichment and provide our own software implementation. Our approach can be readily adapted to any pre-existing package.
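
The idea of a fragile significance can be illustrated with a plain hypergeometric enrichment test that is recomputed after dropping each hit gene in turn. This is a minimal sketch on invented gene labels; `enrichment_p` and `leave_one_out_p` are hypothetical names, and the leave-one-out check is a stand-in for the robustness assessment described above, not the authors' procedure.

```python
from math import comb


def enrichment_p(universe, gene_set, hits):
    """Hypergeometric P(overlap >= observed) for a gene set within a hit list."""
    gene_set = gene_set & universe
    hits = hits & universe
    n_universe, n_set, n_hits = len(universe), len(gene_set), len(hits)
    n_overlap = len(gene_set & hits)
    total = comb(n_universe, n_hits)
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_hits - k)
        for k in range(n_overlap, min(n_set, n_hits) + 1)
    )
    return tail / total


def leave_one_out_p(universe, gene_set, hits):
    """Recompute the p-value with each hit gene removed from the analysis."""
    return {
        g: enrichment_p(universe - {g}, gene_set - {g}, hits - {g})
        for g in sorted(hits)
    }


# Toy data: 20 genes, one annotated set of 4, and a 5-gene hit list.
universe = {f"g{i}" for i in range(20)}
gene_set = {"g0", "g1", "g2", "g3"}
hits = {"g0", "g1", "g2", "g10", "g11"}

p = enrichment_p(universe, gene_set, hits)
loo = leave_one_out_p(universe, gene_set, hits)
print(round(p, 4))          # 0.032
print(round(loo["g0"], 4))  # 0.097
```

Here removing a single annotated hit moves the result across the conventional 0.05 threshold, which is the kind of non-robust finding the abstract warns about.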


Subject(s)
Genes , Genomics/methods , Algorithms , Animals , Autistic Disorder/genetics , Cell Hypoxia/genetics , Gene Expression , Gene Ontology , Genome-Wide Association Study , Humans , Mice , Molecular Sequence Annotation , Octamer Transcription Factor-3/metabolism , Schizophrenia/genetics , Software
8.
Bioinformatics ; 33(4): 612-614, 2017 02 15.
Article in English | MEDLINE | ID: mdl-27993773

ABSTRACT

Summary: Evaluating gene networks with respect to known biology is a common task but often a computationally costly one. Many computational experiments are difficult to apply exhaustively in network analysis due to run-times. To permit high-throughput analysis of gene networks, we have implemented a set of very efficient tools to calculate functional properties in networks based on guilt-by-association methods. EGAD (Extending 'Guilt-by-Association' by Degree) allows gene networks to be evaluated with respect to hundreds or thousands of gene sets. The methods predict novel members of gene groups, assess how well a gene network groups known sets of genes, and determine the degree to which generic predictions drive performance. By allowing fast evaluations, whether of random sets or real functional ones, EGAD provides the user with an assessment of performance which can easily be used in controlled evaluations across many parameters. Availability and Implementation: The software package is freely available at https://github.com/sarbal/EGAD and implemented for use in R and Matlab. The package is also freely available under the LGPL license from the Bioconductor web site ( http://bioconductor.org ). Contact: JGillis@cshl.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
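
A guilt-by-association evaluation of the kind described can be sketched as neighbor voting: each gene is scored by the share of its connection weight that goes to known members of a gene set, and held-out members are then checked for high scores via AUROC. The code below is an illustrative Python sketch on a toy network; the function names are hypothetical and this is not the package's R or Matlab interface.

```python
def build_network(edges):
    """Symmetric weighted network as a dict of dicts, from (gene, gene, weight) edges."""
    network = {}
    for a, b, w in edges:
        network.setdefault(a, {})[b] = w
        network.setdefault(b, {})[a] = w
    return network


def neighbor_voting_scores(network, known):
    """Score = weight of edges to known set members divided by the gene's total degree."""
    scores = {}
    for gene, neighbors in network.items():
        degree = sum(neighbors.values())
        votes = sum(w for n, w in neighbors.items() if n in known)
        scores[gene] = votes / degree if degree else 0.0
    return scores


def auroc(scores, positives, candidates):
    """Probability that a held-out positive outscores a negative (ties count half)."""
    pos = [scores[g] for g in candidates if g in positives]
    neg = [scores[g] for g in candidates if g not in positives]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy network: a, b, c form one module; d, e, f form another.
network = build_network([
    ("a", "b", 1.0), ("a", "c", 1.0), ("b", "c", 1.0),
    ("c", "d", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
])

gene_set = {"a", "b", "c"}
training = {"a", "b"}            # members shown to the predictor
held_out = gene_set - training   # member hidden for evaluation

scores = neighbor_voting_scores(network, training)
candidates = [g for g in network if g not in training]
auc = auroc(scores, held_out, candidates)
print(scores["c"], auc)
```

Repeating the hold-out over several folds and many gene sets, and comparing against a predictor based on node degree alone, would show how much of the performance is generic rather than set-specific.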


Subject(s)
Computational Biology/methods , Gene Regulatory Networks , Software , Animals , Humans , Saccharomyces cerevisiae/genetics
9.
PLoS Comput Biol ; 12(4): e1004868, 2016 Apr.
Article in English | MEDLINE | ID: mdl-27082953

ABSTRACT

In addition to detecting novel transcripts and higher dynamic range, a principal claim for RNA-sequencing has been greater replicability, typically measured in sample-sample correlations of gene expression levels. Through a re-analysis of ENCODE data, we show that replicability of transcript abundances will provide misleading estimates of the replicability of conditional variation in transcript abundances (i.e., most expression experiments). Heuristics which implicitly address this problem have emerged in quality control measures to obtain 'good' differential expression results. However, these methods involve strict filters such as discarding low expressing genes or using technical replicates to remove discordant transcripts, and are costly or simply ad hoc. As an alternative, we model gene-level replicability of differential activity using co-expressing genes. We find that sets of housekeeping interactions provide a sensitive means of estimating the replicability of expression changes, where the co-expressing pair can be regarded as pseudo-replicates of one another. We model the effects of noise that perturbs a gene's expression within its usual distribution of values and show that perturbing expression by only 5% within that range is readily detectable (AUROC~0.73). We have made our method available as a set of easily implemented R scripts.


Subject(s)
Sequence Analysis, RNA/statistics & numerical data , Computational Biology , Databases, Nucleic Acid/statistics & numerical data , Gene Expression , Humans , Models, Statistical , Quality Control , Reproducibility of Results , Sequence Analysis, RNA/standards , Signal-To-Noise Ratio
10.
Nat Commun ; 15(1): 3315, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38632311

ABSTRACT

This study investigates the humoral and cellular immune responses and health-related quality of life measures in individuals with mild to moderate long COVID (LC) compared to age- and gender-matched recovered COVID-19 controls (MC) over 24 months. LC participants show elevated nucleocapsid IgG levels at 3 months, and higher neutralizing capacity up to 8 months post-infection. Increased spike-specific and nucleocapsid-specific CD4+ T cells, PD-1, and TIM-3 expression on CD4+ and CD8+ T cells were observed at 3 and 8 months, but these differences do not persist at 24 months. Some LC participants had detectable IFN-γ and IFN-β, which was attributed to reinfection and antigen re-exposure. Single-cell RNA sequencing at the 24-month timepoint shows similar immune cell proportions and reconstitution of naïve T and B cell subsets in LC and MC. No significant differences in exhaustion scores or antigen-specific T cell clones are observed. These findings suggest resolution of immune activation in LC and return to comparable immune responses between LC and MC over time. Improvement in self-reported health-related quality of life at 24 months was also evident in the majority of LC (62%). PTX3, CRP levels and platelet count are associated with improvements in health-related quality of life.


Subject(s)
COVID-19 , Post-Acute COVID-19 Syndrome , Humans , CD8-Positive T-Lymphocytes , Quality of Life , SARS-CoV-2 , Antibodies, Viral
11.
BMC Bioinformatics ; 14: 249, 2013 Aug 16.
Article in English | MEDLINE | ID: mdl-23947436

ABSTRACT

BACKGROUND: Candidate disease gene prediction is a rapidly developing area of bioinformatics research with the potential to deliver great benefits to human health. As experimental studies detecting associations between genetic intervals and disease proliferate, better bioinformatic techniques that can expand and exploit the data are required. DESCRIPTION: Gentrepid is a web resource which predicts and prioritizes candidate disease genes for both Mendelian and complex diseases. The system can take input from linkage analysis of single genetic intervals or multiple marker loci from genome-wide association studies. The underlying database of the Gentrepid tool sources data from numerous gene and protein resources, taking advantage of the wealth of biological information available. Using known disease gene information from OMIM, the system predicts and prioritizes disease gene candidates that participate in the same protein pathways or share similar protein domains. Alternatively, using an ab initio approach, the system can detect enrichment of these protein annotations without prior knowledge of the phenotype. CONCLUSIONS: The system aims to integrate the wealth of protein information currently available with known and novel phenotype/genotype information to acquire knowledge of biological mechanisms underpinning disease. We have updated the system to facilitate analysis of GWAS data and the study of complex diseases. Application of the system to GWAS data on hypertension using the ICBP data is provided as an example. An interesting prediction is a ZIP transporter additional to the one found by the ICBP analysis. The webserver URL is https://www.gentrepid.org/.


Subject(s)
Computational Biology/methods , Databases, Genetic , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Internet , Humans , Phenotype
12.
Nat Commun ; 14(1): 7226, 2023 11 09.
Article in English | MEDLINE | ID: mdl-37940702

ABSTRACT

Genetic and environmental variation are key contributors during organism development, but the influence of minor perturbations or noise is difficult to assess. This study focuses on the stochastic variation in allele-specific expression that persists through cell divisions in the nine-banded armadillo (Dasypus novemcinctus). We investigated the blood transcriptome of five wild monozygotic quadruplets over time to explore the influence of developmental stochasticity on gene expression. We identify an enduring signal of autosomal allelic variability that distinguishes individuals within a quadruplet despite their genetic similarity. This stochastic allelic variation, akin to X-inactivation but broader, provides insight into non-genetic influences on phenotype. The presence of stochastically canalized allelic signatures represents a novel axis for characterizing organismal variability, complementing traditional approaches based on genetic and environmental factors. We also developed a model to explain the inconsistent penetrance associated with these stochastically canalized allelic expressions. By elucidating mechanisms underlying the persistence of allele-specific expression, we enhance understanding of development's role in shaping organismal diversity.


Subject(s)
Armadillos , Humans , Animals , Armadillos/physiology , Phenotype , Alleles , Penetrance
13.
Dev Cell ; 57(16): 1995-2008.e5, 2022 08 22.
Article in English | MEDLINE | ID: mdl-35914524

ABSTRACT

X-chromosome inactivation (XCI) is a random, permanent, and developmentally early epigenetic event that occurs during mammalian embryogenesis. We harness these features to investigate characteristics of early lineage specification events during human development. We initially assess the consistency of X-inactivation and establish a robust set of XCI-escape genes. By analyzing variance in XCI ratios across tissues and individuals, we find that XCI is shared across all tissues, suggesting that XCI is completed in the epiblast (in at least 6-16 cells) prior to specification of the germ layers. Additionally, we exploit tissue-specific variability to characterize the number of cells present during tissue-lineage commitment, ranging from approximately 20 cells in liver and whole blood tissues to 80 cells in brain tissues. By investigating the variability of XCI ratios using adult tissue, we characterize embryonic features of human XCI and lineage specification that are otherwise difficult to ascertain experimentally.
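
The link between XCI-ratio variability and cell number can be sketched with a simple binomial argument: if n founder cells each inactivate one parental X independently with probability p, the XCI ratio across individuals has variance p(1 - p)/n, so n is roughly p(1 - p) divided by the observed variance. The code below applies that relation to invented ratios; the binomial model with p = 0.5 and the function name are assumptions for illustration, and the study's actual estimation procedure may differ.

```python
from statistics import variance


def estimate_cell_number(xci_ratios, p=0.5):
    """Estimate founder cell count from the spread of XCI ratios across individuals.

    Assumes each founder cell chooses its inactive X independently with
    probability p, so Var(ratio) = p * (1 - p) / n.
    """
    observed_variance = variance(xci_ratios)  # sample variance
    return p * (1 - p) / observed_variance


# Invented XCI ratios (fraction of cells with the maternal X active) for five individuals.
ratios = [0.40, 0.60, 0.50, 0.45, 0.55]
n_cells = estimate_cell_number(ratios)
print(round(n_cells))  # 40
```

Measurement noise adds to the observed variance, so an estimate made this way would tend to understate the true cell number unless that noise is modeled separately.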


Subject(s)
Embryo, Mammalian , X Chromosome Inactivation , Adult , Animals , Chromosomes, Human, X/genetics , Humans , Mammals/genetics , X Chromosome Inactivation/genetics
14.
BMC Genet ; 12: 98, 2011 Nov 13.
Article in English | MEDLINE | ID: mdl-22077927

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) aim to identify causal variants and genes for complex disease by independently testing a large number of SNP markers for disease association. Although genes have been implicated in these studies, few utilise the multiple-hit model of complex disease to identify causal candidates. A major benefit of multi-locus comparison is that it compensates for some shortcomings of current statistical analyses that test the frequency of each SNP in isolation for the phenotype population versus control. RESULTS: Here we developed and benchmarked several protocols for GWAS data analysis using different in-silico gene prediction and prioritisation methodologies. We adopted a high sensitivity approach to the data, using less conservative statistical SNP associations. Multiple gene search spaces, either of fixed-widths or proximity-based, were generated around each SNP marker. We used the candidate disease gene prediction system Gentrepid to identify candidates based on shared biomolecular pathways or domain-based protein homology. Predictions were made either with phenotype-specific known disease genes as input; or without a priori knowledge, by exhaustive comparison of genes in distinct loci. Because Gentrepid uses biomolecular data to find interactions and common features between genes in distinct loci of the search spaces, it takes advantage of the multi-locus aspect of the data. CONCLUSIONS: Results suggest testing multiple SNP-to-gene search spaces compensates for differences in phenotypes, populations and SNP platforms. Surprisingly, domain-based homology information was more informative when benchmarked against gene candidates reported by GWA studies compared to previously determined disease genes, possibly suggesting a larger contribution of gene homologs to complex diseases than Mendelian diseases.


Subject(s)
Disease/genetics , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Databases, Genetic , Databases, Protein , Humans , Software
15.
PLoS Comput Biol ; 6(2): e1000672, 2010 Feb 12.
Article in English | MEDLINE | ID: mdl-20168992

ABSTRACT

Genes encoding proteins in a common pathway are often found near each other along bacterial chromosomes. Several explanations have been proposed to account for the evolution of these structures. For instance, natural selection may directly favour gene clusters through a variety of mechanisms, such as increased efficiency of coregulation. An alternative and controversial hypothesis is the selfish operon model, which asserts that clustered arrangements of genes are more easily transferred to other species, thus improving the prospects for survival of the cluster. According to another hypothesis (the persistence model), genes that are in close proximity are less likely to be disrupted by deletions. Here we develop computational models to study the conditions under which gene clusters can evolve and persist. First, we examine the selfish operon model by re-implementing the simulation and running it under a wide range of conditions. Second, we introduce and study a Moran process in which there is natural selection for gene clustering and rearrangement occurs by genome inversion events. Finally, we develop and study a model that includes selection and inversion, which tracks the occurrence and fixation of rearrangements. Surprisingly, gene clusters fail to evolve under a wide range of conditions. Factors that promote the evolution of gene clusters include a low number of genes in the pathway, a high population size, and in the case of the selfish operon model, a high horizontal transfer rate. The computational analysis here has shown that the evolution of gene clusters can occur under both direct and indirect selection as long as certain conditions hold. Under these conditions the selfish operon model is still viable as an explanation for the evolution of gene clusters.


Subject(s)
Evolution, Molecular , Genome, Bacterial , Models, Genetic , Multigene Family , Bacterial Proteins/genetics , Cluster Analysis , Computational Biology/methods , Gene Rearrangement , Operon
16.
Cardiovasc Res ; 117(10): 2216-2227, 2021 08 29.
Article in English | MEDLINE | ID: mdl-33002116

ABSTRACT

AIMS: Cardiac electrical activity is extraordinarily robust. However, when it goes wrong it can have fatal consequences. Electrical activity in the heart is controlled by the carefully orchestrated activity of more than a dozen different ion conductances. While there is considerable variability in cardiac ion channel expression levels between individuals, studies in rodents have indicated that there are modules of ion channels whose expression co-vary. The aim of this study was to investigate whether meta-analytic co-expression analysis of large-scale gene expression datasets could identify modules of co-expressed cardiac ion channel genes in human hearts that are of functional importance. METHODS AND RESULTS: Meta-analysis of 3653 public human RNA-seq datasets identified a strong correlation between expression of CACNA1C (L-type calcium current, ICaL) and KCNH2 (rapid delayed rectifier K+ current, IKr), which was also observed in human adult heart tissue samples. In silico modelling suggested that co-expression of CACNA1C and KCNH2 would limit the variability in action potential duration seen with variations in expression of ion channel genes and reduce susceptibility to early afterdepolarizations, a surrogate marker for proarrhythmia. We also found that levels of KCNH2 and CACNA1C expression are correlated in human-induced pluripotent stem cell-derived cardiac myocytes and the levels of CACNA1C and KCNH2 expression were inversely correlated with the magnitude of changes in repolarization duration following inhibition of IKr. CONCLUSION: Meta-analytic approaches of multiple independent human gene expression datasets can be used to identify gene modules that are important for regulating heart function. Specifically, we have verified that there is co-expression of CACNA1C and KCNH2 ion channel genes in human heart tissue, and in silico analyses suggest that CACNA1C-KCNH2 co-expression increases the robustness of cardiac electrical activity.


Subject(s)
Action Potentials , Arrhythmias, Cardiac/metabolism , Calcium Channels, L-Type/metabolism , ERG1 Potassium Channel/metabolism , Heart Rate , Induced Pluripotent Stem Cells/metabolism , Myocytes, Cardiac/metabolism , Arrhythmias, Cardiac/genetics , Arrhythmias, Cardiac/physiopathology , Arrhythmias, Cardiac/prevention & control , Calcium Channels, L-Type/genetics , Cells, Cultured , Databases, Genetic , ERG1 Potassium Channel/genetics , Humans , Models, Cardiovascular , RNA-Seq , Signal Transduction , Time Factors
17.
BMC Bioinformatics ; 10 Suppl 1: S69, 2009 Jan 30.
Article in English | MEDLINE | ID: mdl-19208173

ABSTRACT

BACKGROUND: Automated candidate gene prediction systems allow geneticists to hone in on disease genes more rapidly by identifying the most probable candidate genes linked to the disease phenotypes under investigation. Here we assessed the ability of eight different candidate gene prediction systems to predict disease genes in intervals previously associated with type 2 diabetes by benchmarking their performance against genes implicated by recent genome-wide association studies. RESULTS: Using a search space of 9556 genes, all but one of the systems pruned the genome in favour of genes associated with moderate to highly significant SNPs. Of the 11 genes associated with highly significant SNPs identified by the genome-wide association studies, eight were flagged as likely candidates by at least one of the prediction systems. A list of candidates produced by a previous consensus approach did not match any of the genes implicated by 706 moderate to highly significant SNPs flagged by the genome-wide association studies. We prioritized genes associated with medium significance SNPs. CONCLUSION: The study appraises the relative success of several candidate gene prediction systems against independent genetic data. Even when confronted with challengingly large intervals, the candidate gene prediction systems can successfully select likely disease genes. Furthermore, they can be used to filter statistically less-well-supported genetic data to select more likely candidates. We suggest consensus approaches fail because they penalize novel predictions made from independent underlying databases. To realize their full potential further work needs to be done on prioritization and annotation of genes.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genome, Human , Genome-Wide Association Study/methods , Databases, Genetic , Gene Regulatory Networks , Humans , Models, Genetic , Polymorphism, Single Nucleotide
18.
Genome Biol ; 20(1): 159, 2019 08 09.
Article in English | MEDLINE | ID: mdl-31399121

ABSTRACT

The use of the human reference genome has shaped methods and data across modern genomics. This has offered many benefits while creating a few constraints. In the following opinion, we outline the history, properties, and pitfalls of the current human reference genome. In a few illustrative analyses, we focus on its use for variant-calling, highlighting its nearness to a 'type specimen'. We suggest that switching to a consensus reference would offer important advantages over the continued use of the current reference with few disadvantages.


Subject(s)
Genomics/standards , Human Genome , Humans , Reference Standards
19.
Nat Commun ; 9(1): 884, 2018 02 28.
Article in English | MEDLINE | ID: mdl-29491377

ABSTRACT

Single-cell RNA-sequencing (scRNA-seq) technology provides a new avenue to discover and characterize cell types; however, the experiment-specific technical biases and analytic variability inherent to current pipelines may undermine its replicability. Meta-analysis is further hampered by the use of ad hoc naming conventions. Here we demonstrate our replication framework, MetaNeighbor, that quantifies the degree to which cell types replicate across datasets, and enables rapid identification of clusters with high similarity. We first measure the replicability of neuronal identity, comparing results across eight technically and biologically diverse datasets to define best practices for more complex assessments. We then apply this to novel interneuron subtypes, finding that 24/45 subtypes have evidence of replication, which enables the identification of robust candidate marker genes. Across tasks we find that large sets of variably expressed genes can identify replicable cell types with high accuracy, suggesting a general route forward for large-scale evaluation of scRNA-seq data.


Subject(s)
Neurons/metabolism , RNA/genetics , Computational Biology , Gene Expression Profiling , Humans , Neurons/cytology , RNA/metabolism , RNA Sequence Analysis , Single-Cell Analysis
20.
Genome Med ; 9(1): 64, 2017 07 07.
Article in English | MEDLINE | ID: mdl-28687074

ABSTRACT

BACKGROUND: Disagreements over genetic signatures associated with disease have been particularly prominent in the field of psychiatric genetics, creating a sharp divide between disease burdens attributed to common and rare variation, with study designs independently targeting each. Meta-analysis within each of these study designs is routine, whether using raw data or summary statistics, but combining results across study designs is atypical. However, tests of functional convergence are used across all study designs, where candidate gene sets are assessed for overlaps with previously known properties. This suggests one possible avenue for combining not study data, but the functional conclusions that they reach. METHOD: In this work, we test for functional convergence in autism spectrum disorder (ASD) across different study types, and specifically whether the degree to which a gene is implicated in autism is correlated with the degree to which it drives functional convergence. Because different study designs are distinguishable by their differences in effect size, this also provides a unified means of incorporating the impact of study design into the analysis of convergence. RESULTS: We detected remarkably significant positive trends in aggregate (p < 2.2e-16) with 14 individually significant properties (false discovery rate <0.01), many in areas researchers have targeted based on different reasoning, such as the fragile X mental retardation protein (FMRP) interactor enrichment (false discovery rate 0.003). We are also able to detect novel technical effects and we see that network enrichment from protein-protein interaction data is heavily confounded with study design, arising readily in control data. CONCLUSIONS: We see a convergent functional signal for a subset of known and novel functions in ASD from all sources of genetic variation. Meta-analytic approaches explicitly accounting for different study designs can be adapted to other diseases to discover novel functional associations and increase statistical power.


Subject(s)
Autism Spectrum Disorder/genetics , Genomics/methods , Meta-Analysis as Topic , Mutation , Genetic Polymorphism , Female , Fragile X Mental Retardation Protein/genetics , Genetic Predisposition to Disease , Humans , Male , Genetic Models