1.
J Pathol ; 254(4): 418-429, 2021 07.
Article in English | MEDLINE | ID: mdl-33748968

ABSTRACT

Human genetics plays an increasingly important role in drug development and population health. Here we review the history of human genetics in the context of accelerating the discovery of therapies, present examples of how human genetics evidence supports successful drug targets, and discuss how polygenic risk scores could be beneficial in various clinical settings. We highlight the value of direct-to-consumer platforms in the era of fast-paced big data biotechnology, and how diverse genetic and health data can benefit society. © 2021 23andMe, Inc. The Journal of Pathology published by John Wiley & Sons, Ltd. on behalf of The Pathological Society of Great Britain and Ireland.


Subject(s)
Drug Discovery, Human Genome, Humans
2.
Hum Mol Genet ; 25(14): 3096-3105, 2016 07 15.
Article in English | MEDLINE | ID: mdl-27260402

ABSTRACT

We compared coding region variants of 53 cognitively healthy centenarians and 45 patients with Alzheimer's disease (AD), all of Ashkenazi Jewish (AJ) ancestry. Despite the small sample size, the known AD risk variant APOE4 reached genome-wide significance, indicating the advantage of utilizing 'super-controls'. We restricted our subsequent analysis to rare variants observed at most once in the 1000 Genomes database and having a minor allele frequency below 2% in our AJ sample. We compared the burden of predicted protein-altering variants between cases and controls as normalized by the level of rare synonymous variants. We observed an increased burden among AD subjects for predicted loss-of-function (LoF) variants, defined as stop-gain, frameshift, initiation codon (INIT) and splice site mutations (n = 930, OR = 1.3, P = 1.5 × 10^-5). There was no enrichment across all rare protein-altering variants, defined as missense plus LoFs, in-frame indels and stop-loss variants (n = 13,014, OR = 0.97, P = 0.47). Among LoFs, the strongest burden was observed for INIT (OR = 2.16, P = 0.0097) and premature stop variants predicted to cause nonsense-mediated decay (NMD) in the majority of transcripts (OR = 1.98, P = 0.02). Notably, this increased burden of NMD, INIT and splice variants was more pronounced in a set of 1397 innate immune genes (OR = 4.55, P = 0.0043). Further comparison to additional exomes indicates that the difference in LoF burden originated from both the AD and centenarian samples. In summary, we observed an overall increased burden of rare LoFs in AD subjects as compared to centenarians, and this enrichment is more pronounced for innate immune genes.


Subject(s)
Alzheimer Disease/genetics, Exome/genetics, Genetic Predisposition to Disease, Innate Immunity/genetics, Inflammation/genetics, Aged 80 and Over, Alzheimer Disease/pathology, Apolipoprotein E4/genetics, Female, Gene Frequency, Genetic Variation, Human Genome, Genome-Wide Association Study, Humans, INDEL Mutation, Inflammation/pathology, Jews/genetics, Male, Single Nucleotide Polymorphism
3.
Bioinformatics ; 32(20): 3196-3198, 2016 10 15.
Article in English | MEDLINE | ID: mdl-27354699

ABSTRACT

MOTIVATION: Sequencing of matched tumor and normal samples is the standard study design for reliable detection of somatic alterations. However, even very low levels of cross-sample contamination significantly impact calling of somatic mutations, because contaminant germline variants can be incorrectly interpreted as somatic. There are currently no sequence-only based methods that reliably estimate contamination levels in tumor samples, which frequently display copy number changes. As a solution, we developed Conpair, a tool for detection of sample swaps and cross-individual contamination in whole-genome and whole-exome tumor-normal sequencing experiments. RESULTS: On a ladder of in silico contaminated samples, we demonstrated that Conpair reliably measures contamination levels as low as 0.1%, even in the presence of copy number changes. We also estimated contamination levels in glioblastoma WGS and WXS tumor-normal datasets from TCGA and showed that they strongly correlate with tumor-normal concordance, as well as with the number of germline variants called as somatic by several widely used somatic callers. AVAILABILITY AND IMPLEMENTATION: The method is available at: https://github.com/nygenome/conpair CONTACT: egrabowska@gmail.com or mczody@nygenome.org SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
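
The tumor-normal concordance mentioned above can be illustrated with a toy calculation. This is only a sketch of the general idea and not Conpair's implementation: the function name, the marker IDs, and the genotype encoding (0/1/2 alternate-allele counts) are all hypothetical.

```python
def genotype_concordance(tumor, normal):
    """Fraction of shared marker sites where tumor and normal genotypes agree.

    Both arguments map a marker ID to a genotype call. This is an
    illustrative simplification, not the method used by Conpair.
    """
    shared = [site for site in tumor if site in normal]
    if not shared:
        raise ValueError("no shared marker sites")
    matches = sum(1 for site in shared if tumor[site] == normal[site])
    return matches / len(shared)


# Hypothetical genotype calls (0/1/2 = number of alternate alleles).
normal = {"m1": 0, "m2": 1, "m3": 2, "m4": 0}
tumor_same_person = {"m1": 0, "m2": 1, "m3": 2, "m4": 0}
tumor_other_person = {"m1": 2, "m2": 0, "m3": 2, "m4": 1}

matched = genotype_concordance(tumor_same_person, normal)
swapped = genotype_concordance(tumor_other_person, normal)
print(matched, swapped)
```

A concordance close to 1 is what one would expect for a correctly paired sample, while a much lower value hints at a swap; a real tool also has to account for sequencing error and copy number changes, which this sketch ignores.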


Subject(s)
Computer Simulation, Neoplasm DNA, Neoplasms, Algorithms, High-Throughput Nucleotide Sequencing, Humans, Neoplasms/pathology
4.
Nature ; 471(7339): 499-503, 2011 Mar 24.
Article in English | MEDLINE | ID: mdl-21346763

ABSTRACT

Rare copy number variants (CNVs) have a prominent role in the aetiology of schizophrenia and other neuropsychiatric disorders. Substantial risk for schizophrenia is conferred by large (>500-kilobase) CNVs at several loci, including microdeletions at 1q21.1 (ref. 2), 3q29 (ref. 3), 15q13.3 (ref. 2) and 22q11.2 (ref. 4) and microduplication at 16p11.2 (ref. 5). However, these CNVs collectively account for a small fraction (2-4%) of cases, and the relevant genes and neurobiological mechanisms are not well understood. Here we performed a large two-stage genome-wide scan of rare CNVs and report the significant association of copy number gains at chromosome 7q36.3 with schizophrenia. Microduplications with variable breakpoints occurred within a 362-kilobase region and were detected in 29 of 8,290 (0.35%) patients versus 2 of 7,431 (0.03%) controls in the combined sample. All duplications overlapped or were located within 89 kilobases upstream of the vasoactive intestinal peptide receptor gene VIPR2. VIPR2 transcription and cyclic-AMP signalling were significantly increased in cultured lymphocytes from patients with microduplications of 7q36.3. These findings implicate altered vasoactive intestinal peptide signalling in the pathogenesis of schizophrenia and indicate the VPAC2 receptor as a potential target for the development of new antipsychotic drugs.
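
The carrier counts quoted in this abstract (29 of 8,290 patients versus 2 of 7,431 controls) can be checked with a few lines of arithmetic. The sketch below recomputes the two carrier frequencies and a crude, unadjusted odds ratio; the odds ratio is our own arithmetic from those counts, not a figure reported in the abstract, and the function name is hypothetical.

```python
def carrier_stats(case_carriers, case_total, control_carriers, control_total):
    """Return (case frequency, control frequency, crude odds ratio)."""
    case_freq = case_carriers / case_total
    control_freq = control_carriers / control_total
    # Odds ratio from the 2x2 table of carriers vs non-carriers.
    odds_ratio = (case_carriers * (control_total - control_carriers)) / (
        (case_total - case_carriers) * control_carriers
    )
    return case_freq, control_freq, odds_ratio


# Counts taken from the abstract above.
case_freq, control_freq, odds_ratio = carrier_stats(29, 8290, 2, 7431)
print(f"{case_freq:.2%} vs {control_freq:.2%}, crude OR = {odds_ratio:.1f}")
```

The frequencies round to the 0.35% and 0.03% given in the text. With only two carriers among controls, the crude odds ratio has a very wide confidence interval and should be read as an order-of-magnitude indication.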


Subject(s)
DNA Copy Number Variations/genetics, Duplicate Genes/genetics, Genetic Predisposition to Disease/genetics, Vasoactive Intestinal Peptide Type II Receptors/genetics, Schizophrenia/genetics, Cell Line, Human Chromosome Pair 7/genetics, Cohort Studies, Cyclic AMP/metabolism, Female, Gene Dosage/genetics, Genome-Wide Association Study, Humans, Inheritance Patterns/genetics, Male, Pedigree, Vasoactive Intestinal Peptide Type II Receptors/metabolism, Reproducibility of Results, Schizophrenia/metabolism, Signal Transduction, Genetic Transcription/genetics
5.
Hum Mol Genet ; 23(17): 4693-702, 2014 Sep 01.
Article in English | MEDLINE | ID: mdl-24842889

ABSTRACT

The recent series of large genome-wide association studies in European and Japanese cohorts established that Parkinson disease (PD) has a substantial genetic component. To further investigate the genetic landscape of PD, we performed a genome-wide scan in the largest Ashkenazi Jewish cohort to date, comprising 1130 Parkinson patients and 2611 pooled controls. Motivated by the reduced disease allele heterogeneity and a high degree of identical-by-descent (IBD) haplotype sharing in this founder population, we conducted a haplotype association study based on mapping of shared IBD segments. We observed significant haplotype association signals at three previously implicated Parkinson loci: LRRK2 (OR = 12.05, P = 1.23 × 10^-56), MAPT (OR = 0.62, P = 1.78 × 10^-11) and GBA (multiple distinct haplotypes, OR > 8.28, P = 1.13 × 10^-11 and OR = 2.50, P = 1.22 × 10^-9). In addition, we identified a novel association signal on chr2q14.3 coming from a rare haplotype (OR = 22.58, P = 1.21 × 10^-10) and replicated it in a secondary cohort of 306 Ashkenazi PD cases and 2583 controls. Our results highlight the power of our haplotype association method, particularly useful in studies of founder populations, and reaffirm the benefits of studying complex diseases in Ashkenazi Jewish cohorts.


Subject(s)
Chromosome Mapping, Ethnicity/genetics, Genealogy and Heraldry, Genetic Predisposition to Disease, Genome-Wide Association Study, Parkinson Disease/genetics, Aged, Cohort Studies, Demography, Female, Genetic Loci/genetics, Haplotypes/genetics, Humans, Male, Single Nucleotide Polymorphism/genetics, Reproducibility of Results
6.
Am J Med Genet B Neuropsychiatr Genet ; 168(8): 649-59, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26198764

ABSTRACT

Schizophrenia is a common, clinically heterogeneous disorder associated with lifelong morbidity and early mortality. Several genetic variants associated with schizophrenia have been identified, but the majority of the heritability remains unknown. In this study, we report on a case-control sample of Ashkenazi Jews (AJ), a founder population that may provide additional insights into the genetic etiology of schizophrenia. We performed a genome-wide association analysis (GWAS) of 592 cases and 505 controls of AJ ancestry ascertained in the U.S. Subsequently, we performed a meta-analysis with an Israeli AJ sample of 913 cases and 1640 controls, followed by a meta-analysis and polygenic risk scoring using summary results from the Psychiatric GWAS Consortium 2 (PGC2) schizophrenia study. The U.S. AJ sample showed strong evidence of polygenic inheritance (pseudo-R^2 ≈ 9.7%) and a SNP-heritability estimate of 0.39 (P = 0.00046). We found no genome-wide significant associations in the U.S. sample or in the combined U.S./Israeli AJ meta-analysis of 1505 cases and 2145 controls. The strongest AJ-specific associations (P-values in the 10^-6 to 10^-7 range) were in the 22q11.2 deletion region and included the genes TBX1, GLN1, and COMT. Supportive evidence (meta P < 1 × 10^-4) was also found for several previously identified genome-wide significant findings, including the HLA region, CNTN4, IMMP2L, and GRIN2A. The meta-analysis of the U.S. sample with the PGC2 results provided initial genome-wide significant evidence for six new loci. Among the novel potential susceptibility genes is PEPD, a gene involved in proline metabolism, which is associated with a Mendelian disorder characterized by developmental delay and cognitive deficits.


Subject(s)
Jews/genetics, Schizophrenia/genetics, Case-Control Studies, Female, Genetic Predisposition to Disease, Genome-Wide Association Study, Humans, Israel/epidemiology, Jews/statistics & numerical data, Male, Middle Aged, Single Nucleotide Polymorphism, Schizophrenia/epidemiology, United States/epidemiology
7.
PLoS Comput Biol ; 8(10): e1002709, 2012.
Article in English | MEDLINE | ID: mdl-23055912

ABSTRACT

The effects of disease mutations on protein structure and function have been extensively investigated, and many predictors of the functional impact of single amino acid substitutions are publicly available. The majority of these predictors are based on protein structure and evolutionary conservation, following the assumption that disease mutations predominantly affect folded and conserved protein regions. However, the prevalence of the intrinsically disordered proteins (IDPs) and regions (IDRs) in the human proteome together with their lack of fixed structure and low sequence conservation raise a question about the impact of disease mutations in IDRs. Here, we investigate annotated missense disease mutations and show that 21.7% of them are located within such intrinsically disordered regions. We further demonstrate that 20% of disease mutations in IDRs cause local disorder-to-order transitions, which represents a 1.7-2.7 fold increase compared to annotated polymorphisms and neutral evolutionary substitutions, respectively. Secondary structure predictions show elevated rates of transition from helices and strands into loops and vice versa in the disease mutations dataset. Disease disorder-to-order mutations also influence predicted molecular recognition features (MoRFs) more often than the control mutations. The repertoire of disorder-to-order transition mutations is limited, with five most frequent mutations (R→W, R→C, E→K, R→H, R→Q) collectively accounting for 44% of all deleterious disorder-to-order transitions. As a proof of concept, we performed accelerated molecular dynamics simulations on a deleterious disorder-to-order transition mutation of tumor protein p63 and, in agreement with our predictions, observed an increased α-helical propensity of the region harboring the mutation. Our findings highlight the importance of mutations in IDRs and refine the traditional structure-centric view of disease mutations. 
The results of this study offer a new perspective on the role of mutations in disease, with implications for improving predictors of the functional impact of missense mutations.


Subject(s)
Disease/genetics, Genetic Models, Mutation, Proteins/genetics, Arginine/genetics, Cluster Analysis, Computational Biology, Humans, Molecular Dynamics Simulation, Protein Conformation, Proteins/chemistry, Proteins/metabolism, DNA Sequence Analysis, Transcription Factors, Tumor Suppressor Proteins
8.
Proc Natl Acad Sci U S A ; 106(48): 20429-34, 2009 Dec 01.
Article in English | MEDLINE | ID: mdl-19915147

ABSTRACT

Although remission rates for metastatic melanoma are generally very poor, some patients can survive for prolonged periods following metastasis. We used gene expression profiling, mitotic index (MI), and quantification of tumor infiltrating leukocytes (TILs) and CD3+ cells in metastatic lesions to search for a molecular basis for this observation and to develop improved methods for predicting patient survival. We identified a group of 266 genes associated with postrecurrence survival. Genes positively associated with survival were predominantly immune response related (e.g., ICOS, CD3d, ZAP70, TRAT1, TARP, GZMK, LCK, CD2, CXCL13, CCL19, CCR7, VCAM1) while genes negatively associated with survival were cell proliferation related (e.g., PDE4D, CDK2, GREF1, NUSAP1, SPC24). Furthermore, any of the 4 parameters (prevalidated gene expression signature, TILs, CD3, and in particular MI) improved the ability of Tumor, Node, Metastasis (TNM) staging to predict postrecurrence survival; MI was the most significant contributor (HR = 2.13, P = 0.0008). An immune response gene expression signature and presence of TILs and CD3+ cells signify immune surveillance as a mechanism for prolonged survival in these patients and indicate improved patient subcategorization beyond current TNM staging.


Subject(s)
Neoplastic Gene Expression Regulation/immunology, Melanoma/diagnosis, Melanoma/genetics, Neoplasm Staging/methods, Gene Expression Profiling/methods, Humans, Immunohistochemistry, Tumor-Infiltrating Lymphocytes/pathology, Melanoma/immunology, Melanoma/secondary, Mitotic Index/methods, Oligonucleotide Array Sequence Analysis, Prognosis, Survival Analysis
9.
Proteins ; 78(2): 365-80, 2010 Feb 01.
Article in English | MEDLINE | ID: mdl-19722269

ABSTRACT

Ubiquitination plays an important role in many cellular processes and is implicated in many diseases. Experimental identification of ubiquitination sites is challenging due to rapid turnover of ubiquitinated proteins and the large size of the ubiquitin modifier. We identified 141 new ubiquitination sites using a combination of liquid chromatography, mass spectrometry, and mutant yeast strains. Investigation of the sequence biases and structural preferences around known ubiquitination sites indicated that their properties were similar to those of intrinsically disordered protein regions. Using a combined set of new and previously known ubiquitination sites, we developed a random forest predictor of ubiquitination sites, UbPred. The class-balanced accuracy of UbPred reached 72%, with the area under the ROC curve at 80%. The application of UbPred showed that high-confidence Rsp5 ubiquitin ligase substrates and proteins with very short half-lives were significantly enriched in the number of predicted ubiquitination sites. Proteome-wide prediction of ubiquitination sites in Saccharomyces cerevisiae indicated that highly ubiquitinated substrates were prevalent among transcription/enzyme regulators and proteins involved in cell cycle control. In the human proteome, cytoskeletal, cell cycle, regulatory, and cancer-associated proteins display a higher extent of ubiquitination than proteins from other functional categories. We show that gain and loss of predicted ubiquitination sites may represent a molecular mechanism behind a number of disease-associated mutations. UbPred is available at http://www.ubpred.org.
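
Class-balanced accuracy, the metric quoted above, is conventionally the mean of sensitivity and specificity, which keeps a rare positive class from being swamped by the negatives. The sketch below uses that standard definition; the confusion-matrix counts are hypothetical and are not taken from the UbPred evaluation.

```python
def class_balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def plain_accuracy(tp, fn, tn, fp):
    """Fraction of all examples classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


# Hypothetical counts with ten times more negative than positive sites.
counts = dict(tp=80, fn=20, tn=600, fp=400)
balanced = class_balanced_accuracy(**counts)
plain = plain_accuracy(**counts)
print(f"balanced = {balanced:.3f}, plain = {plain:.3f}")
```

With these made-up counts the two measures diverge because plain accuracy is dominated by the larger negative class, which is the usual reason for reporting the balanced figure on imbalanced site-prediction data.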


Subject(s)
Proteome/analysis, Saccharomyces cerevisiae Proteins/analysis, Saccharomyces cerevisiae/metabolism, Ubiquitinated Proteins/analysis, Amino Acid Sequence, Protein Databases, Endosomal Sorting Complexes Required for Transport/metabolism, Humans, Mass Spectrometry, Molecular Sequence Data, Proteome/metabolism, Saccharomyces cerevisiae Proteins/metabolism, Protein Sequence Analysis, Ubiquitin-Protein Ligase Complexes/metabolism, Ubiquitinated Proteins/metabolism, Ubiquitination
10.
Nucleic Acids Res ; 35(Database issue): D786-93, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17145717

ABSTRACT

The Database of Protein Disorder (DisProt) links structure and function information for intrinsically disordered proteins (IDPs). Intrinsically disordered proteins do not form a fixed three-dimensional structure under physiological conditions, either in their entireties or in segments or regions. We define IDP as a protein that contains at least one experimentally determined disordered region. Although lacking fixed structure, IDPs and regions carry out important biological functions, being typically involved in regulation, signaling and control. Such functions can involve high-specificity low-affinity interactions, the multiple binding of one protein to many partners and the multiple binding of many proteins to one partner. These three features are all enabled and enhanced by protein intrinsic disorder. One of the major hindrances in the study of IDPs has been the lack of organized information. DisProt was developed to enable IDP research by collecting and organizing knowledge regarding the experimental characterization and the functional associations of IDPs. In addition to being a unique source of biological information, DisProt opens doors for a plethora of bioinformatics studies. DisProt is openly available at http://www.disprot.org.


Subject(s)
Protein Databases, Protein Conformation, Internet, Protein Folding, Proteins/physiology, User-Computer Interface
11.
Nat Genet ; 51(3): 394-403, 2019 03.
Article in English | MEDLINE | ID: mdl-30804565

ABSTRACT

Insomnia is the second most prevalent mental disorder, with no sufficient treatment available. Despite substantial heritability, insight into the associated genes and neurobiological pathways remains limited. Here, we use a large genetic association sample (n = 1,331,010) to detect novel loci and gain insight into the pathways, tissue and cell types involved in insomnia complaints. We identify 202 loci implicating 956 genes through positional, expression quantitative trait loci, and chromatin mapping. The meta-analysis explained 2.6% of the variance. We show gene set enrichments for the axonal part of neurons, cortical and subcortical tissues, and specific cell types, including striatal, hypothalamic, and claustrum neurons. We found considerable genetic correlations with psychiatric traits and sleep duration, and modest correlations with other sleep-related traits. Mendelian randomization identified the causal effects of insomnia on depression, diabetes, and cardiovascular disease, and the protective effects of educational attainment and intracranial volume. Our findings highlight key brain areas and cell types implicated in insomnia, and provide new treatment targets.


Subject(s)
Genetic Predisposition to Disease/genetics, Quantitative Trait Loci/genetics, Sleep Initiation and Maintenance Disorders/genetics, Chromatin/genetics, Female, Genome-Wide Association Study/methods, Humans, Male, Middle Aged, Phenotype, Single Nucleotide Polymorphism/genetics, Sleep/genetics
13.
BMC Med Genomics ; 12(1): 56, 2019 04 25.
Article in English | MEDLINE | ID: mdl-31023376

ABSTRACT

BACKGROUND: Prompted by the revolution in high-throughput sequencing and its potential impact for treating cancer patients, we initiated a clinical research study to compare the ability of different sequencing assays and analysis methods to analyze glioblastoma tumors and generate real-time potential treatment options for physicians. METHODS: A consortium of seven institutions in New York City enrolled 30 patients with glioblastoma and performed tumor whole genome sequencing (WGS) and RNA sequencing (RNA-seq; collectively WGS/RNA-seq); 20 of these patients were also analyzed with independent targeted panel sequencing. We also compared results of expert manual annotations with those from an automated annotation system, Watson Genomic Analysis (WGA), to assess the reliability and time required to identify potentially relevant pharmacologic interventions. RESULTS: WGS/RNA-seq identified more potentially actionable clinical results than targeted panels in 90% of cases, with an average of 16-fold more unique potentially actionable variants identified per individual; 84 clinically actionable calls were made using WGS/RNA-seq that were not identified by panels. Expert annotation and WGA had good agreement on identifying variants [mean sensitivity = 0.71, SD = 0.18 and positive predictive value (PPV) = 0.80, SD = 0.20] and drug targets when the same variants were called (mean sensitivity = 0.74, SD = 0.34 and PPV = 0.79, SD = 0.23) across patients. Clinicians used the information to modify their treatment plan 10% of the time. CONCLUSION: These results present the first comprehensive comparison of technical and machine-augmented analysis of targeted panel and WGS/RNA-seq to identify potential cancer treatments.
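
The per-patient sensitivity and PPV figures above can be illustrated with a small set-based calculation. This sketch assumes the expert annotation serves as the reference set, which the abstract does not state explicitly; the function name, patient labels, and variant labels are hypothetical.

```python
from statistics import mean


def sensitivity_and_ppv(reference_calls, test_calls):
    """Compare two call sets, treating reference_calls as the truth set.

    sensitivity = shared / reference calls; PPV = shared / test calls.
    """
    reference, test = set(reference_calls), set(test_calls)
    if not reference or not test:
        raise ValueError("both call sets must be non-empty")
    shared = reference & test
    return len(shared) / len(reference), len(shared) / len(test)


# Hypothetical calls for two patients: (expert annotation, automated annotation).
patients = {
    "patient_1": ({"EGFR amp", "PTEN del", "TP53 mut", "CDKN2A del"},
                  {"EGFR amp", "PTEN del", "TP53 mut", "NF1 mut"}),
    "patient_2": ({"EGFR amp", "PIK3CA mut"}, {"EGFR amp"}),
}

results = [sensitivity_and_ppv(expert, auto) for expert, auto in patients.values()]
mean_sensitivity = mean(s for s, _ in results)
mean_ppv = mean(p for _, p in results)
print(f"mean sensitivity = {mean_sensitivity:.3f}, mean PPV = {mean_ppv:.3f}")
```

Averaging per patient, as done here, weights each patient equally regardless of how many variants were called, which is one plausible reading of the "across patients" summary in the abstract.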


Subject(s)
Glioblastoma/drug therapy, Glioblastoma/genetics, Whole Genome Sequencing, Adult, Aged, Aged 80 and Over, Female, High-Throughput Nucleotide Sequencing, Humans, Male, Middle Aged, Molecular Targeted Therapy, Ploidies, Reproducibility of Results
14.
BMC Genomics ; 9 Suppl 2: S1, 2008 Sep 16.
Article in English | MEDLINE | ID: mdl-18831774

ABSTRACT

BACKGROUND: Our first predictor of protein disorder was published just over a decade ago in the Proceedings of the IEEE International Conference on Neural Networks (Romero P, Obradovic Z, Kissinger C, Villafranca JE, Dunker AK (1997) Identifying disordered regions in proteins from amino acid sequence. Proceedings of the IEEE International Conference on Neural Networks, 1: 90-95). By now more than twenty other laboratory groups have joined the efforts to improve the prediction of protein disorder. While the various prediction methodologies used for protein intrinsic disorder resemble those methodologies used for secondary structure prediction, the two types of structures are entirely different. For example, the two structural classes have very different dynamic properties, with the irregular secondary structure class being much less mobile than the disorder class. The prediction of secondary structure has been useful. On the other hand, the prediction of intrinsic disorder has been revolutionary, leading to major modifications of the more than 100 year-old views relating protein structure and function. Experimentalists have been providing evidence over many decades that some proteins lack fixed structure or are disordered (or unfolded) under physiological conditions. In addition, experimentalists are also showing that, for many proteins, their functions depend on the unstructured rather than structured state; such results are in marked contrast to the greater than hundred year old views such as the lock and key hypothesis. Despite extensive data on many important examples, including disease-associated proteins, the importance of disorder for protein function has been largely ignored. Indeed, to our knowledge, current biochemistry books don't present even one acknowledged example of a disorder-dependent function, even though some reports of disorder-dependent functions are more than 50 years old. 
The results from genome-wide predictions of intrinsic disorder and the results from other bioinformatics studies of intrinsic disorder are demanding attention for these proteins. RESULTS: Disorder prediction has been important for showing that the relatively few experimentally characterized examples are members of a very large collection of related disordered proteins that are wide-spread over all three domains of life. Many significant biological functions are now known to depend directly on, or are importantly associated with, the unfolded or partially folded state. Here our goal is to review the key discoveries and to weave these discoveries together to support novel approaches for understanding sequence-function relationships. CONCLUSION: Intrinsically disordered protein is common across the three domains of life, but especially common among the eukaryotic proteomes. Signaling sequences and sites of posttranslational modifications are frequently, or very likely most often, located within regions of intrinsic disorder. Disorder-to-order transitions are coupled with the adoption of different structures with different partners. Also, the flexibility of intrinsic disorder helps different disordered regions to bind to a common binding site on a common partner. Such capacity for binding diversity plays important roles in both protein-protein interaction networks and likely also in gene regulation networks. Such disorder-based signaling is further modulated in multicellular eukaryotes by alternative splicing, for which such splicing events map to regions of disorder much more often than to regions of structure. Associating alternative splicing with disorder rather than structure alleviates theoretical and experimentally observed problems associated with the folding of different length, isomeric amino acid sequences. 
The combination of disorder and alternative splicing is proposed to provide a mechanism for easily "trying out" different signaling pathways, thereby providing the mechanism for generating signaling diversity and enabling the evolution of cell differentiation and multicellularity. Finally, several recent small molecules of interest as potential drugs have been shown to act by blocking protein-protein interactions based on intrinsic disorder of one of the partners. Study of these examples has led to a new approach for drug discovery, and bioinformatics analysis of the human proteome suggests that various disease-associated proteins are very rich in such disorder-based drug discovery targets.


Subject(s)
Computational Biology, Protein Folding, Proteins/chemistry, Proteins/metabolism, Algorithms, Alternative Splicing, Amino Acid Sequence, Binding Sites, Drug Design, Humans, Protein Conformation, Protein Sequence Analysis, Structure-Activity Relationship
15.
BMC Mol Biol ; 9: 6, 2008 Jan 14.
Article in English | MEDLINE | ID: mdl-18194570

ABSTRACT

BACKGROUND: In spite of large intergenic spaces in plant and animal genomes, 7% to 30% of genes in the genomes encode overlapping cis-natural antisense transcripts (cis-NATs). The widespread occurrence of cis-NATs suggests an evolutionary advantage for this type of genomic arrangement. Experimental evidence for the regulation of two cis-NAT gene pairs by natural antisense transcripts-generated small interfering RNAs (nat-siRNAs) via the RNA interference (RNAi) pathway has been reported in Arabidopsis. However, the extent of siRNA-mediated regulation of cis-NAT genes is still unclear in any genome. RESULTS: The hallmarks of RNAi regulation of NATs are 1) inverse regulation of two genes in a cis-NAT pair by environmental and developmental cues and 2) generation of siRNAs by cis-NAT genes. We examined Arabidopsis transcript profiling data from public microarray databases to identify cis-NAT pairs whose sense and antisense transcripts show opposite expression changes. A subset of the cis-NAT genes displayed negatively correlated expression profiles as well as inverse differential expression changes under at least one of the examined developmental stages or treatment conditions. By searching the Arabidopsis Small RNA Project (ASRP) and Massively Parallel Signature Sequencing (MPSS) small RNA databases as well as our stress-treated small RNA dataset, we found small RNAs that matched at least one gene in 646 pairs out of 1008 (64%) protein-coding cis-NAT pairs, which suggests that siRNAs may regulate the expression of many cis-NAT genes. 209 putative siRNAs have the potential to target more than one gene and half of these small RNAs could target multiple members of a gene family. Furthermore, the majority of the putative siRNAs within the overlapping regions tend to target only one transcript of a given NAT pair, which is consistent with our previous finding on salt- and bacteria-induced nat-siRNAs. 
In addition, we found that genes encoding plastid- or mitochondrion-targeted proteins are over-represented in the Arabidopsis cis-NATs and that 19% of sense and antisense partner genes of cis-NATs share at least one common Gene Ontology term, which suggests that they encode proteins with possible functional connection. CONCLUSION: The negatively correlated expression patterns of sense and antisense genes as well as the presence of siRNAs in many of the cis-NATs suggest that siRNA regulation of cis-NATs via the RNAi pathway is an important gene regulatory mechanism for at least a subgroup of cis-NATs in Arabidopsis.


Subject(s)
Arabidopsis/genetics, Plant Gene Expression Regulation, RNA Interference, Antisense RNA/metabolism, Small Interfering RNA/metabolism, Gene Expression Profiling, Antisense RNA/genetics, Small Interfering RNA/genetics
16.
Nat Commun ; 9(1): 1178, 2018 03 21.
Article in English | MEDLINE | ID: mdl-29563502

ABSTRACT

Hyperemesis gravidarum (HG), severe nausea and vomiting of pregnancy, occurs in 0.3-2% of pregnancies and is associated with maternal and fetal morbidity. The cause of HG remains unknown, but familial aggregation and results of twin studies suggest that understanding the genetic contribution is essential for comprehending the disease etiology. Here, we conduct a genome-wide association study (GWAS) for binary (HG) and ordinal (severity of nausea and vomiting) phenotypes of pregnancy complications. Two loci, chr19p13.11 and chr4q12, are genome-wide significant (p < 5 × 10^-8) in both association scans and are replicated in an independent cohort. The genes implicated at these two loci are GDF15 and IGFBP7, respectively, both known to be involved in placentation, appetite, and cachexia. While proving the causal roles of GDF15 and IGFBP7 in nausea and vomiting of pregnancy requires further study, this GWAS provides insights into the genetic risk factors contributing to the disease.


Subject(s)
Growth Differentiation Factor 15/genetics , Hyperemesis Gravidarum/genetics , Insulin-Like Growth Factor Binding Proteins/genetics , Nausea/genetics , Placenta/metabolism , Pregnancy Complications/genetics , Vomiting/genetics , Adult , Appetite/genetics , Chromosomes, Human, Pair 19 , Chromosomes, Human, Pair 4 , Cohort Studies , Female , Gene Expression , Genome, Human , Genome-Wide Association Study , Growth Differentiation Factor 15/metabolism , Humans , Hyperemesis Gravidarum/metabolism , Hyperemesis Gravidarum/physiopathology , Insulin-Like Growth Factor Binding Proteins/metabolism , Nausea/etiology , Nausea/metabolism , Nausea/physiopathology , Phenotype , Placenta/pathology , Pregnancy , Pregnancy Complications/metabolism , Pregnancy Complications/physiopathology , Quantitative Trait Loci , Risk Factors , Severity of Illness Index , Vomiting/metabolism , Vomiting/physiopathology
17.
Commun Biol ; 1: 20, 2018.
Article in English | MEDLINE | ID: mdl-30271907

ABSTRACT

Reliable detection of somatic variations is of critical importance in cancer research. Here we present Lancet, an accurate and sensitive somatic variant caller, which detects SNVs and indels by jointly analyzing reads from tumor and matched normal samples using colored de Bruijn graphs. We demonstrate, through extensive experimental comparison on synthetic and real whole-genome sequencing datasets, that Lancet has better accuracy, especially for indel detection, than widely used somatic callers, such as MuTect, MuTect2, LoFreq, Strelka, and Strelka2. Lancet features a reliable variant scoring system, which is essential for variant prioritization, and detects low-frequency mutations without sacrificing the sensitivity to call longer insertions and deletions empowered by the local-assembly engine. In addition to genome-wide analysis, Lancet allows inspection of somatic variants in graph space, which augments the traditional read alignment visualization to help confirm a variant of interest. Lancet is available as an open-source program at https://github.com/nygenome/lancet.
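
The idea of jointly analyzing tumor and normal reads in a colored de Bruijn graph can be shown with a toy sketch. It is not Lancet's implementation, which adds local assembly, error handling, and variant scoring; the function names, the sample labels `"tumor"` and `"normal"`, and the choice of k are assumptions made for illustration.

```python
from collections import defaultdict


def build_colored_dbg(reads_by_sample, k):
    """Build a toy colored de Bruijn graph.

    reads_by_sample: dict mapping a sample label (e.g. "tumor", "normal")
                     to a list of read strings
    Returns (colors, edges):
      colors: k-mer -> set of sample labels in which the k-mer occurs
      edges:  k-mer -> set of k-mers that directly follow it in some read
    """
    colors = defaultdict(set)
    edges = defaultdict(set)
    for sample, reads in reads_by_sample.items():
        for read in reads:
            kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
            for kmer in kmers:
                colors[kmer].add(sample)
            # Consecutive k-mers overlap by k-1 bases and form an edge.
            for left, right in zip(kmers, kmers[1:]):
                edges[left].add(right)
    return colors, edges


def tumor_only_kmers(colors):
    """K-mers seen only in the tumor sample: candidate evidence for somatic variants."""
    return sorted(kmer for kmer, samples in colors.items() if samples == {"tumor"})
```

With the toy reads `ACGTAC` (normal) and `ACGGAC` (tumor) and k = 3, the shared k-mer `ACG` branches into `CGT` and `CGG`, and the tumor-only k-mers are `CGG`, `GAC`, and `GGA`. In a real caller such a branch would be one bubble among many, and sequencing errors would have to be filtered out before calling a variant.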

19.
BMC Bioinformatics ; 8: 211, 2007 Jun 19.
Article in English | MEDLINE | ID: mdl-17578581

ABSTRACT

BACKGROUND: Composition Profiler is a web-based tool for semi-automatic discovery of enrichment or depletion of amino acids, either individually or grouped by their physico-chemical or structural properties. RESULTS: The program takes two samples of amino acids as input: a query sample and a reference sample. The latter provides a suitable background amino acid distribution, and should be chosen according to the nature of the query sample, for example, a standard protein database (e.g. SwissProt, PDB), a representative sample of proteins from the organism under study, or a group of proteins with a contrasting functional annotation. The results of the analysis of amino acid composition differences are summarized in textual and graphical form. CONCLUSION: As an exploratory data mining tool, our software can be used to guide feature selection for protein function or structure predictors. For classes of proteins with significant differences in frequencies of amino acids having particular physico-chemical (e.g. hydrophobicity or charge) or structural (e.g. alpha helix propensity) properties, Composition Profiler can be used as a rough, light-weight visual classifier.
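
The comparison of a query sample against a reference sample can be sketched as follows. This is an illustrative approximation and not the tool's own code: the relative-difference measure, the function names, and the restriction to the 20 standard amino acids are assumptions, and the statistical significance testing that a real analysis needs is omitted.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def composition(sequences):
    """Fraction of each standard amino acid across a list of protein sequences."""
    counts = Counter(
        residue
        for seq in sequences
        for residue in seq.upper()
        if residue in AMINO_ACIDS
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found in the sample")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def relative_difference(query, reference):
    """(query fraction - reference fraction) / reference fraction, per amino acid.

    Positive values indicate enrichment in the query sample, negative values
    depletion. Residues absent from the reference are skipped because the
    ratio is undefined for them.
    """
    q = composition(query)
    r = composition(reference)
    return {aa: (q[aa] - r[aa]) / r[aa] for aa in AMINO_ACIDS if r[aa] > 0}
```

For example, a query of `["AAAK"]` compared with a reference of `["AAKK"]` gives +0.5 for alanine (enriched) and -0.5 for lysine (depleted).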


Subject(s)
Algorithms , Proteins/chemistry , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Software , User-Computer Interface , Amino Acid Sequence , Computer Graphics , Molecular Sequence Data
20.
J Comput Biol ; 14(9): 1160-75, 2007 Nov.
Article in English | MEDLINE | ID: mdl-17990975

ABSTRACT

The assignment of orthologous genes between a pair of genomes is a fundamental and challenging problem in comparative genomics, since many computational methods for solving various biological problems critically rely on bona fide orthologs as input. While it is usually done using sequence similarity search, we recently proposed a new combinatorial approach that combines sequence similarity and genome rearrangement. This paper continues the development of the approach and unites genome rearrangement events and (post-speciation) duplication events in a single framework under the parsimony principle. In this framework, orthologous genes are assumed to correspond to each other in the most parsimonious evolutionary scenario involving both genome rearrangement and (post-speciation) gene duplication. Besides several original algorithmic contributions, the enhanced method allows for the detection of inparalogs. Following this approach, we have implemented a high-throughput system for ortholog assignment on a genome scale, called MSOAR, and applied it to human and mouse genomes. As the results show, MSOAR finds 99 more true orthologs than the INPARANOID program. In comparison to the iterated exemplar algorithm on simulated data, MSOAR performed favorably in terms of assignment accuracy. We also validated our predicted main ortholog pairs between human and mouse using public ortholog assignment datasets, synteny information, and gene function classification. These test results indicate that our approach is very promising for genome-wide ortholog assignment. Supplemental material and the MSOAR program are available at http://msoar.cs.ucr.edu.


Subject(s)
Computational Biology/methods , Genome/genetics , Sequence Homology, Nucleic Acid , Software , Algorithms , Animals , Chromosomes, Mammalian/genetics , Computer Simulation , Evolution, Molecular , Humans , Mice , Proteome , Synteny