Results 1 - 16 of 16
1.
medRxiv ; 2023 Sep 06.
Article in English | MEDLINE | ID: mdl-37732190

ABSTRACT

Purpose: The risk of developing age-related macular degeneration (AMD) is influenced by genetic background. In 2016, the International AMD Genomics Consortium (IAMDGC) identified 52 risk variants in 34 loci, and a polygenic risk score (PRS) based on these variants was associated with AMD. The Israeli population has a unique genetic composition: Ashkenazi Jewish (AJ), Jewish non-Ashkenazi, and Arab sub-populations. We aimed to perform a genome-wide association study (GWAS) for AMD in Israel, and to evaluate PRSs for AMD. Methods: For our discovery set, we recruited 403 AMD patients and 256 controls at Hadassah Medical Center. We genotyped all individuals via a custom exome chip. We imputed non-typed variants using cosmopolitan and AJ reference panels. We recruited an additional 155 cases and 69 controls for validation. To evaluate the predictive power of PRSs for AMD, we used IAMDGC summary statistics excluding our study and developed PRSs via either clumping/thresholding or LDpred2. Results: In our discovery set, 31/34 loci previously reported by the IAMDGC were associated with AMD at P < 0.05. Of those, all effects were directionally consistent with the IAMDGC and 11 loci had a P value below the Bonferroni-corrected threshold (0.05/34 = 0.0015). At a threshold of 5 × 10^-5, we discovered four suggestive associations in FAM189A1, IGDCC4, C7orf50, and CNTNAP4. However, only the FAM189A1 variant was associated with AMD in the replication cohort after Bonferroni correction. A prediction model including the LDpred2-based PRS and other covariates had an AUC of 0.82 (95% CI: 0.79-0.85) and performed better than a covariates-only model (P = 5.1 × 10^-9). Conclusions: Previously reported AMD-associated loci were nominally associated with AMD in Israel. A PRS developed based on a large international study is predictive in Israeli populations.
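
The Bonferroni threshold quoted in the abstract comes from dividing the nominal significance level by the number of loci tested. A minimal sketch of that arithmetic follows; the function name and the example P values are illustrative assumptions, not data from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# 34 previously reported loci tested at a nominal level of 0.05, as in the abstract.
threshold = bonferroni_threshold(0.05, 34)
print(f"{threshold:.4f}")  # 0.0015

# Hypothetical P values, for illustration only.
example_p_values = [1e-6, 0.001, 0.002, 0.04]
passing = [p for p in example_p_values if p < threshold]
print(len(passing))  # 2
```

The same division applies to any family of tests; only the count of tests changes.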

2.
Genome Res ; 33(7): 1023-1031, 2023 07.
Article in English | MEDLINE | ID: mdl-37562965

ABSTRACT

The pairwise sequentially Markovian coalescent (PSMC) algorithm and its extensions infer the coalescence time of two homologous chromosomes at each genomic position. This inference is used in reconstructing demographic histories, detecting selection signatures, studying genome-wide associations, constructing ancestral recombination graphs, and more. Inference of coalescence times between each pair of haplotypes in a large data set is of great interest, as they may provide rich information about the population structure and history of the sample. Here, we introduce a new method, Gamma-SMC, which is more than 10 times faster than current methods. To obtain this speed-up, we represent the posterior coalescence time distributions succinctly as a gamma distribution with just two parameters; in contrast, PSMC and its extensions hold these in a vector over discrete intervals of time. Thus, Gamma-SMC has constant time-complexity per site, without dependence on the number of discrete time states. Additionally, because of this continuous representation, our method is able to infer times spanning many orders of magnitude and, as such, is robust to parameter misspecification. We describe how this approach works, show its performance on simulated and real data, and illustrate its use in studying recent positive selection in the 1000 Genomes Project data set.
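
To see why a two-parameter gamma representation gives a constant cost per site, consider the following sketch. It assumes a Poisson-style mutation model with per-site rate `theta`, under which a gamma posterior over coalescence time stays gamma after each observation. The abstract does not spell out the emission model, so this is an illustration of the general idea rather than the Gamma-SMC implementation. The recombination (transition) step is omitted entirely, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GammaPosterior:
    """Gamma distribution over coalescence time, in shape/rate form."""
    shape: float
    rate: float

    @property
    def mean(self) -> float:
        return self.shape / self.rate


def emission_update(post: GammaPosterior, is_het: bool, theta: float) -> GammaPosterior:
    """Conjugate update for one site (assumed Poisson mutation model).

    Homozygous site: likelihood exp(-theta * t), so only the rate grows.
    Heterozygous site: likelihood proportional to t * exp(-theta * t),
    so the shape also grows by one.
    """
    new_shape = post.shape + (1.0 if is_het else 0.0)
    return GammaPosterior(new_shape, post.rate + theta)


# Three sites (hom, het, hom) starting from an exponential(1) prior.
posterior = GammaPosterior(shape=1.0, rate=1.0)
for site_is_het in [False, True, False]:
    posterior = emission_update(posterior, site_is_het, theta=0.1)
print(posterior.shape, round(posterior.rate, 6), round(posterior.mean, 4))
```

Each update touches two numbers regardless of how finely time would otherwise be discretized, which is the contrast with a vector over discrete time intervals.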


Subject(s)
Genome; Genomics; Haplotypes; Chromosomes/genetics; Algorithms; Models, Genetic; Genetics, Population
3.
Invest Ophthalmol Vis Sci ; 61(2): 48, 2020 02 07.
Article in English | MEDLINE | ID: mdl-32106291

ABSTRACT

Purpose: Anti-vascular endothelial growth factor (VEGF) therapy for neovascular AMD (nvAMD) yields variable outcomes. We performed a genome-wide association study for anti-VEGF treatment response in nvAMD to identify variants potentially underlying this variability. Methods: Israeli patients with nvAMD who underwent anti-VEGF treatment (n = 187) were genotyped on a whole exome chip containing approximately 500,000 variants. Genotypes were correlated with delta visual acuity (deltaVA) between baseline and after three injections of anti-VEGF. Top principal components, age, and baseline VA were included in the analysis. Two lead associated variants were genotyped in an independent validation set of patients with nvAMD (n = 108). Results: Linear regression analysis on 5,353,842 variants revealed five exonic variants with an association P value of less than 6 × 10^-5. The top variant in the gene VWA3A (P = 1.77 × 10^-6) was tested in the validation cohort. The minor allele of the VWA3A variant was associated with worse response to treatment (P = 0.02). The average deltaVA of discovery plus validation was -0.214 logMAR (≈ a gain of 10.7 Early Treatment Diabetic Retinopathy Study letters) for homozygotes for the major allele, 0.172 logMAR for heterozygotes (≈ a loss of 8.6 Early Treatment Diabetic Retinopathy Study letters), and 0.21 logMAR for homozygotes for the minor allele (≈ a loss of 10.5 Early Treatment Diabetic Retinopathy Study letters). Minor allele carriers had a higher frequency of macular hemorrhage at baseline. Conclusions: A VWA3A gene variant was associated with worse response to anti-VEGF treatment in Israeli patients with nvAMD. The VWA3A protein is a precursor of the multimeric von Willebrand factor, which is involved in blood coagulation, a system previously associated with nvAMD.
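
The letter counts quoted in the abstract are consistent with the usual conversion of 0.02 logMAR per ETDRS letter (5 letters per 0.1 logMAR line). A small sketch of that conversion, with a hypothetical function name and genotype labels chosen for readability:

```python
LETTERS_PER_LOGMAR = 50  # 0.02 logMAR per ETDRS letter


def logmar_change_to_letters(delta_logmar: float) -> float:
    """Convert a change in logMAR to ETDRS letters.

    A decrease in logMAR is an improvement, so the sign is flipped:
    positive return values are letters gained, negative are letters lost.
    """
    return -delta_logmar * LETTERS_PER_LOGMAR


# Mean deltaVA values reported in the abstract, by genotype.
reported = {
    "major/major": -0.214,
    "heterozygote": 0.172,
    "minor/minor": 0.21,
}
for genotype, delta in reported.items():
    print(f"{genotype}: {logmar_change_to_letters(delta):+.1f} letters")
```

Run as written, the three lines print +10.7, -8.6, and -10.5 letters, matching the abstract.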


Subject(s)
Angiogenesis Inhibitors/therapeutic use; Choroidal Neovascularization; Protein Precursors/genetics; Wet Macular Degeneration; Aged; Aged, 80 and over; Choroidal Neovascularization/drug therapy; Choroidal Neovascularization/genetics; Female; Humans; Israel; Male; Middle Aged; Regression Analysis; Visual Acuity; Wet Macular Degeneration/drug therapy; Wet Macular Degeneration/genetics; von Willebrand Factor/genetics
4.
Nat Commun ; 10(1): 3417, 2019 07 31.
Article in English | MEDLINE | ID: mdl-31366909

ABSTRACT

High costs and technical limitations of cell sorting and single-cell techniques currently restrict the collection of large-scale, cell-type-specific DNA methylation data. This, in turn, impedes our ability to tackle key biological questions that pertain to variation within a population, such as identification of disease-associated genes at a cell-type-specific resolution. Here, we show mathematically and empirically that cell-type-specific methylation levels of an individual can be learned from its tissue-level bulk data, conceptually emulating the case where the individual has been profiled with a single-cell resolution and then signals were aggregated in each cell population separately. Provided with this unprecedented way to perform powerful large-scale epigenetic studies with cell-type-specific resolution, we revisit previous studies with tissue-level bulk methylation and reveal novel associations with leukocyte composition in blood and with rheumatoid arthritis. For the latter, we further show consistency with validation data collected from sorted leukocyte sub-types.


Subject(s)
Cell Separation/methods; Computational Biology/methods; DNA Methylation/genetics; Epigenesis, Genetic/genetics; Single-Cell Analysis/methods; Arthritis, Rheumatoid/blood; CpG Islands/genetics; Humans; Leukocyte Count; Leukocytes/classification; Leukocytes/cytology
5.
Bioinformatics ; 35(12): 2162-2164, 2019 06 01.
Article in English | MEDLINE | ID: mdl-30445428

ABSTRACT

MOTIVATION: Hidden Markov models (HMMs) are powerful tools for modeling processes along the genome. In a standard genomic HMM, observations are drawn, at each genomic position, from a distribution whose parameters depend on a hidden state, and the hidden states evolve along the genome as a Markov chain. Often, the hidden state is the Cartesian product of multiple processes, each evolving independently along the genome. Inference in these so-called Factorial HMMs has a naïve running time that scales as the square of the number of possible states, which by itself increases exponentially with the number of sub-chains; such a running time scaling is impractical for many applications. While faster algorithms exist, there is no available implementation suitable for developing bioinformatics applications. RESULTS: We developed FactorialHMM, a Python package for fast exact inference in Factorial HMMs. Our package allows simulating either directly from the model or from the posterior distribution of states given the observations. Additionally, we allow the inference of all key quantities related to HMMs: (i) the (Viterbi) sequence of states with the highest posterior probability; (ii) the likelihood of the data and (iii) the posterior probability (given all observations) of the marginal and pairwise state probabilities. The running time and space requirement of all procedures is linearithmic in the number of possible states. Our package is highly modular, providing the user with maximal flexibility for developing downstream applications. AVAILABILITY AND IMPLEMENTATION: https://github.com/regevs/factorial_hmm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
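
The gap between quadratic and linearithmic scaling can be illustrated on the transition step alone. The sketch below is not the FactorialHMM package API; all names and matrices are hypothetical. It propagates a distribution over joint states in two ways: naively over all pairs of joint states, and chain by chain, which is possible because the sub-chains evolve independently.

```python
from itertools import product


def naive_step(alpha, chains):
    """One transition over the joint state space: O(N^2) for N joint states."""
    states = list(product(*[range(len(A)) for A in chains]))
    out = {}
    for target in states:
        total = 0.0
        for source in states:
            p = alpha[source]
            for A, s, t in zip(chains, source, target):
                p *= A[s][t]  # joint transition = product of per-chain transitions
            total += p
        out[target] = total
    return out


def factored_step(alpha, chains):
    """Same result, applying one chain's transition matrix at a time."""
    current = dict(alpha)
    for m, A in enumerate(chains):
        nxt = {}
        for state, p in current.items():
            for t in range(len(A)):
                target = state[:m] + (t,) + state[m + 1:]
                nxt[target] = nxt.get(target, 0.0) + p * A[state[m]][t]
        current = nxt
    return current


# Three independent sub-chains with 2, 2 and 3 states (12 joint states).
chains = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.7, 0.3], [0.4, 0.6]],
    [[0.5, 0.3, 0.2], [0.1, 0.8, 0.1], [0.25, 0.25, 0.5]],
]
joint_states = list(product(range(2), range(2), range(3)))
alpha = {s: 1.0 / len(joint_states) for s in joint_states}

naive = naive_step(alpha, chains)
fast = factored_step(alpha, chains)
print(max(abs(naive[s] - fast[s]) for s in joint_states) < 1e-12)
```

With M chains of K states each there are N = K^M joint states, so the factored step costs on the order of M · K · N = K · N · log_K N operations, compared with N^2 for the naive step.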


Subject(s)
Algorithms; Genome; Genomics; Markov Chains; Probability; Software
6.
Nat Commun ; 9(1): 4919, 2018 11 21.
Article in English | MEDLINE | ID: mdl-30464216

ABSTRACT

Testing for association between a set of genetic markers and a phenotype is a fundamental task in genetic studies. Standard approaches for heritability and set testing strongly rely on parametric models that make specific assumptions regarding phenotypic variability. Here, we show that resulting p-values may be inflated by up to 15 orders of magnitude, in a heritability study of methylation measurements, and in a heritability and expression quantitative trait loci analysis of gene expression profiles. We propose FEATHER, a method for fast permutation-based testing of marker sets and of heritability, which properly controls for false-positive results. FEATHER eliminated 47% of methylation sites found to be heritable by the parametric test, suggesting a substantial inflation of false-positive findings by alternative methods. Our approach can rapidly identify heritable phenotypes out of millions of phenotypes acquired via high-throughput technologies, does not suffer from model misspecification and is highly efficient.
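
For readers unfamiliar with permutation-based set testing, here is a plain, unaccelerated sketch of the idea. It is not FEATHER's algorithm, and the test statistic (sum of squared marker-phenotype covariances) and the toy data are assumptions chosen for simplicity.

```python
import random


def set_statistic(genotypes, phenotype):
    """Sum over markers of the squared covariance with the phenotype."""
    n = len(phenotype)
    y_mean = sum(phenotype) / n
    total = 0.0
    for marker in genotypes:  # one list of n genotype values per marker
        g_mean = sum(marker) / n
        cov = sum((g - g_mean) * (y - y_mean) for g, y in zip(marker, phenotype)) / n
        total += cov * cov
    return total


def permutation_p_value(genotypes, phenotype, n_perm=999, seed=0):
    """Empirical P value: how often a shuffled phenotype scores at least as high."""
    rng = random.Random(seed)
    observed = set_statistic(genotypes, phenotype)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if set_statistic(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data: the phenotype equals the first marker, so the set is strongly associated.
marker_1 = [0, 1, 2] * 10
marker_2 = [0] * 15 + [1] * 15
phenotype = [float(g) for g in marker_1]
p_value = permutation_p_value([marker_1, marker_2], phenotype)
print(p_value)
```

Because the null distribution is built from the data themselves, the P value does not depend on a parametric model of phenotypic variability. The cost is one statistic evaluation per permutation, which is the expense that fast methods aim to reduce.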


Subject(s)
Genetic Techniques; Quantitative Trait, Heritable; Statistics as Topic; DNA Methylation; Gene Expression; Phenotype
7.
Genome Biol ; 19(1): 141, 2018 09 21.
Article in English | MEDLINE | ID: mdl-30241486

ABSTRACT

We introduce a Bayesian semi-supervised method for estimating cell counts from DNA methylation by leveraging an easily obtainable prior knowledge on the cell-type composition distribution of the studied tissue. We show mathematically and empirically that alternative methods which attempt to infer cell counts without methylation reference only capture linear combinations of cell counts rather than provide one component per cell type. Our approach allows the construction of components such that each component corresponds to a single cell type, and provides a new opportunity to investigate cell compositions in genomic studies of tissues for which it was not possible before.


Subject(s)
Cell Count/methods; DNA Methylation; Bayes Theorem
8.
J Comput Biol ; 25(7): 794-808, 2018 07.
Article in English | MEDLINE | ID: mdl-29932739

ABSTRACT

Estimation of heritability is an important task in genetics. The use of linear mixed models (LMMs) to determine narrow-sense single-nucleotide polymorphism (SNP)-heritability and related quantities has received much recent attention, because of its ability to account for variants with small effect sizes. Typically, heritability estimation under LMMs uses the restricted maximum likelihood (REML) approach. The common way to report the uncertainty in REML estimation uses standard errors (SEs), which rely on asymptotic properties. However, these assumptions are often violated because of the bounded parameter space, statistical dependencies, and limited sample size, leading to biased estimates and inflated or deflated confidence intervals (CIs). In addition, for larger data sets (e.g., tens of thousands of individuals), the construction of SEs itself may require considerable time, as it requires expensive matrix inversions and multiplications. Here, we present FIESTA (Fast confidence IntErvals using STochastic Approximation), a method for constructing accurate CIs. FIESTA is based on parametric bootstrap sampling and therefore avoids unjustified assumptions on the distribution of the heritability estimator. FIESTA uses stochastic approximation techniques, which accelerate the construction of CIs by several orders of magnitude compared with previous approaches as well as with the analytical approximation used by SEs. FIESTA builds accurate CIs rapidly, for example, requiring only several seconds for data sets of tens of thousands of individuals, making FIESTA a very fast solution to the problem of building accurate CIs for heritability for all data set sizes.


Subject(s)
Genome-Wide Association Study/statistics & numerical data; Models, Statistical; Quantitative Trait Loci/genetics; Computer Simulation; Genotype; Humans; Phenotype; Polymorphism, Single Nucleotide/genetics; Software
9.
Genetics ; 207(4): 1275-1283, 2017 12.
Article in English | MEDLINE | ID: mdl-29025915

ABSTRACT

Testing for the existence of variance components in linear mixed models is a fundamental task in many applicative fields. In statistical genetics, the score test has recently become instrumental in the task of testing an association between a set of genetic markers and a phenotype. With few markers, this amounts to set-based variance component tests, which attempt to increase power in association studies by aggregating weak individual effects. When the entire genome is considered, it allows testing for the heritability of a phenotype, defined as the proportion of phenotypic variance explained by genetics. In the popular score-based Sequence Kernel Association Test (SKAT) method, the assumed distribution of the score test statistic is uncalibrated in small samples, with a correction being computationally expensive. This may cause severe inflation or deflation of P-values, even when the null hypothesis is true. Here, we characterize the conditions under which this discrepancy holds, and show it may occur also in large real datasets, such as a dataset from the Wellcome Trust Case Control Consortium 2 (n = 13,950) study, and, in particular, when the individuals in the sample are unrelated. In these cases, the SKAT approximation tends to be highly overconservative and therefore underpowered. To address this limitation, we suggest an efficient method to calculate exact P-values for the score test in the case of a single variance component and a continuous response vector, which can speed up the analysis by orders of magnitude. Our results enable fast and accurate application of the score test in heritability and in set-based association tests. Our method is available in http://github.com/cozygene/RL-SKAT.


Subject(s)
Genetic Association Studies/statistics & numerical data; Genetic Markers; Genetic Variation; Genome/genetics; Algorithms; Computer Simulation; Humans; Models, Genetic; Phenotype; Polymorphism, Single Nucleotide/genetics; Software
10.
Bioinformatics ; 33(14): i325-i332, 2017 Jul 15.
Article in English | MEDLINE | ID: mdl-28881982

ABSTRACT

MOTIVATION: Epigenome-wide association studies can provide novel insights into the regulation of genes involved in traits and diseases. The rapid emergence of bisulfite-sequencing technologies enables performing such genome-wide studies at the resolution of single nucleotides. However, analysis of data produced by bisulfite-sequencing poses statistical challenges owing to low and uneven sequencing depth, as well as the presence of confounding factors. The recently introduced Mixed model Association for Count data via data AUgmentation (MACAU) can address these challenges via a generalized linear mixed model when confounding can be encoded via a single variance component. However, MACAU cannot be used in the presence of multiple variance components. Additionally, MACAU uses a computationally expensive Markov Chain Monte Carlo (MCMC) procedure, which cannot directly approximate the model likelihood. RESULTS: We present a new method, Mixed model Association via a Laplace ApproXimation (MALAX), that is more computationally efficient than MACAU and allows modeling of multiple variance components. MALAX uses a Laplace approximation rather than MCMC-based approximations, which enables direct approximation of the model likelihood. Through an extensive analysis of simulated and real data, we demonstrate that MALAX successfully addresses statistical challenges introduced by bisulfite-sequencing while controlling for complex sources of confounding, and can be over 50% faster than the state of the art. AVAILABILITY AND IMPLEMENTATION: The full source code of MALAX is available at https://github.com/omerwe/MALAX. CONTACT: omerw@cs.technion.ac.il or ehalperin@cs.ucla.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
DNA Methylation; Epigenomics/methods; Sequence Analysis, DNA/methods; Software; Humans; Markov Chains; Monte Carlo Method; Sulfites
11.
Article in English | MEDLINE | ID: mdl-28149326

ABSTRACT

BACKGROUND: Genetic data are known to harbor information about human demographics, and genotyping data are commonly used for capturing ancestry information by leveraging genome-wide differences between populations. In contrast, it is not clear to what extent population structure is captured by whole-genome DNA methylation data. RESULTS: We demonstrate, using three large-cohort 450K methylation array data sets, that ancestry information signal is mirrored in genome-wide DNA methylation data and that it can be further isolated more effectively by leveraging the correlation structure of CpGs with cis-located SNPs. Based on these insights, we propose a method, EPISTRUCTURE, for the inference of ancestry from methylation data, without the need for genotype data. CONCLUSIONS: EPISTRUCTURE can be used to infer ancestry information of individuals based on their methylation data in the absence of corresponding genetic data. Although genetic data are often collected in epigenetic studies of large cohorts, these are typically not made publicly available, making the application of EPISTRUCTURE especially useful for anyone working on public data. Implementation of EPISTRUCTURE is available in GLINT, our recently released toolset for DNA methylation analysis at: http://glint-epigenetics.readthedocs.io.


Subject(s)
Epigenomics/methods; User-Computer Interface; Algorithms; CpG Islands; DNA Methylation; Databases, Genetic; Genome, Human; Genome-Wide Association Study; Genotype; Humans; Internet; Polymorphism, Single Nucleotide; Principal Component Analysis
12.
Bioinformatics ; 33(12): 1870-1872, 2017 Jun 15.
Article in English | MEDLINE | ID: mdl-28177067

ABSTRACT

SUMMARY: GLINT is a user-friendly command-line toolset for fast analysis of genome-wide DNA methylation data generated using the Illumina human methylation arrays. GLINT, which does not require any programming proficiency, allows easy execution of an epigenome-wide association study analysis pipeline under different models while accounting for known confounders in methylation data. AVAILABILITY AND IMPLEMENTATION: GLINT is command-line software, freely available at https://github.com/cozygene/glint/releases. It requires Python 2.7 and several freely available Python packages. Further information and documentation as well as a quick start tutorial are available at http://glint-epigenetics.readthedocs.io. CONTACT: elior.rahmani@gmail.com or ehalperin@cs.ucla.edu.


Subject(s)
DNA Methylation; Epigenomics/methods; Sequence Analysis, DNA/methods; Software; Genome, Human; Humans; Oligonucleotide Array Sequence Analysis/methods
13.
Am J Hum Genet ; 98(6): 1181-1192, 2016 06 02.
Article in English | MEDLINE | ID: mdl-27259052

ABSTRACT

Estimation of heritability is fundamental in genetic studies. Recently, heritability estimation using linear mixed models (LMMs) has gained popularity because these estimates can be obtained from unrelated individuals collected in genome-wide association studies. Typically, heritability estimation under LMMs uses the restricted maximum likelihood (REML) approach. Existing methods for the construction of confidence intervals and estimators of SEs for REML rely on asymptotic properties. However, these assumptions are often violated because of the bounded parameter space, statistical dependencies, and limited sample size, leading to biased estimates and inflated or deflated confidence intervals. Here, we show that the estimation of confidence intervals by state-of-the-art methods is inaccurate, especially when the true heritability is relatively low or relatively high. We further show that these inaccuracies occur in datasets including thousands of individuals. Such biases are present, for example, in estimates of heritability of gene expression in the Genotype-Tissue Expression project and of lipid profiles in the Ludwigshafen Risk and Cardiovascular Health study. We also show that often the probability that the genetic component is estimated as 0 is high even when the true heritability is bounded away from 0, emphasizing the need for accurate confidence intervals. We propose a computationally efficient method, ALBI (accurate LMM-based heritability bootstrap confidence intervals), for estimating the distribution of the heritability estimator and for constructing accurate confidence intervals. Our method can be used as an add-on to existing methods for estimating heritability and variance components, such as GCTA, FaST-LMM, GEMMA, or EMMAX.


Subject(s)
Cardiovascular Diseases/genetics; Confidence Intervals; Gene-Environment Interaction; Multifactorial Inheritance/genetics; Polymorphism, Single Nucleotide/genetics; Quantitative Trait, Heritable; Computer Simulation; Genome-Wide Association Study; Genotype; Humans; Models, Genetic; Models, Statistical
14.
Bioinformatics ; 27(13): i142-8, 2011 Jul 01.
Article in English | MEDLINE | ID: mdl-21685063

ABSTRACT

MOTIVATION: Much of the large-scale molecular data from living cells can be represented in terms of networks. Such networks occupy a central position in cellular systems biology. In the protein-protein interaction (PPI) network, nodes represent proteins and edges represent connections between them, based on experimental evidence. As PPI networks are rich and complex, a mathematical model is sought to capture their properties and shed light on PPI evolution. The mathematical literature contains various generative models of random graphs. It is a major, still largely open question which of these models (if any) can properly reproduce various biologically interesting networks. Here, we consider this problem where the graph at hand is the PPI network of Saccharomyces cerevisiae. We try to distinguish between a model family that performs a process of copying neighbors, represented by the duplication-divergence (DD) model, and models that do not copy neighbors, with the Barabási-Albert (BA) preferential attachment model as a leading example. RESULTS: The observed property of the network is the distribution of maximal bicliques in the graph. This is a novel criterion to distinguish between models in this area. It is particularly appropriate for this purpose, since it reflects the graph's growth pattern under either model. This test clearly favors the DD model. In particular, for the BA model, the vast majority (92.9%) of the bicliques with both sides ≥4 must be already embedded in the model's seed graph, whereas the corresponding figure for the DD model is only 5.1%. Our results, based on the biclique perspective, conclusively show that a naïve unmodified DD model can capture a key aspect of PPI networks.


Subject(s)
Models, Statistical; Proteins/metabolism; Saccharomyces cerevisiae/metabolism; Protein Interaction Mapping; Systems Biology
15.
Nucleic Acids Res ; 38(Web Server issue): W84-9, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20444873

ABSTRACT

Derivation of biological meaning from large sets of proteins or genes is a frequent task in genomic and proteomic studies. Such sets often arise from experimental methods including large-scale gene expression experiments and mass spectrometry (MS) proteomics. Large sets of genes or proteins are also the outcome of computational methods such as BLAST search and homology-based classifications. We have developed the PANDORA web server, which functions as a platform for the advanced biological analysis of sets of genes, proteins, or proteolytic peptides. First, the input set is mapped to a set of corresponding proteins. Then, an analysis of the protein set produces a graph-based hierarchy which highlights intrinsic relations amongst biological subsets, in light of their different annotations from multiple annotation resources. PANDORA integrates a large collection of annotation sources (GO, UniProt Keywords, InterPro, Enzyme, SCOP, CATH, Gene-3D, NCBI taxonomy and more) that comprise approximately 200,000 different annotation terms associated with approximately 3.2 million sequences from UniProtKB. Statistical enrichment based on a binomial approximation of the hypergeometric distribution and corrected for multiple hypothesis tests is calculated using several background sets, including major gene-expression DNA-chip platforms. Users can also visualize either standard or user-defined binary and quantitative properties alongside the proteins. PANDORA 4.2 is available at http://www.pandora.cs.huji.ac.il.
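
The enrichment statistic mentioned in the abstract, a binomial approximation of the hypergeometric distribution, can be sketched as follows. The function names and the example counts are hypothetical, this is not PANDORA's code, and the multiple-testing correction is left out.

```python
from math import comb


def hypergeom_sf(k: int, population: int, annotated: int, sample: int) -> float:
    """Exact P(X >= k) when `sample` items are drawn without replacement."""
    upper = min(sample, annotated)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, sample)


def binomial_sf(k: int, sample: int, p: float) -> float:
    """P(X >= k) for `sample` independent draws with success probability p."""
    return sum(
        comb(sample, i) * p**i * (1 - p) ** (sample - i)
        for i in range(k, sample + 1)
    )


# Hypothetical example: 400 of 20,000 background proteins carry an annotation,
# and 5 of the 50 proteins in the input set carry it.
population, annotated, sample, observed = 20_000, 400, 50, 5
exact = hypergeom_sf(observed, population, annotated, sample)
approx = binomial_sf(observed, sample, annotated / population)
print(exact, approx)
```

The approximation treats draws as independent with probability `annotated / population`, which is reasonable when the input set is small relative to the background; it becomes less accurate as the sampled fraction grows.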


Subject(s)
Peptides/chemistry; Peptides/metabolism; Proteins/chemistry; Proteins/metabolism; Software; Animals; Data Interpretation, Statistical; Databases, Protein; Humans; Internet; Mass Spectrometry; Mice; Peptides/physiology; Proteins/physiology; Proteomics; Rats; Systems Integration; User-Computer Interface
16.
Biol Direct ; 5: 6, 2010 Jan 26.
Article in English | MEDLINE | ID: mdl-20100358

ABSTRACT

BACKGROUND: Phosphorylation is the most prevalent post-translational modification on eukaryotic proteins. Multisite phosphorylation enables a specific combination of phosphosites to determine the speed, specificity and duration of biological response. Until recent years, the lack of high quality data limited the possibility for analyzing the properties of phosphorylation at the proteome scale and in the context of a wide range of conditions. Thanks to advances in mass spectrometry technologies, thousands of phosphosites from in-vivo experiments were identified and archived in the public domain. Such a resource is appropriate for deriving an unbiased view of phosphosite properties in eukaryotes and of their functional relevance. RESULTS: We present statistically rigorous tests on the spatial and functional properties of a collection of approximately 70,000 reported phosphosites. We show that phosphosites along the protein tend to occur as dense clusters of Serine/Threonines (pS/pT) and between Serine/Threonines and Tyrosines, but generally not as much between Tyrosines (pY) only. This phenomenon is more ubiquitous than anticipated and is pertinent for most eukaryotic proteins: for proteins with ≥ 2 phosphosites, 54% of all pS/pT sites are within 4 amino acids of another site. We found a strong tendency for clustered pS/pT to be activated by the same kinase. Large-scale analyses of phosphopeptides are thus consistent with a cooperative function within the cluster. CONCLUSIONS: We present evidence supporting the notion that clusters of pS/pT but generally not pY should be considered as the elementary building blocks in phosphorylation regulation. Indeed, closely positioned sites tend to be activated by the same kinase, a signal that overrides the tendency of a protein to be activated by a single or only few kinases. Within these clusters, coordination and positional dependency are evident. We postulate that cellular regulation takes advantage of such design. Specifically, phosphosite clusters may increase the robustness of the effectiveness of the phosphorylation-dependent response. REVIEWERS: Reviewed by Joel Bader, Frank Eisenhaber, Emmanuel Levy (nominated by Sarah Teichmann). For the full reviews, please go to the Reviewers' comments section.
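
A clustering figure such as "54% of sites are within 4 amino acids of another site" can be computed from site positions alone. The sketch below shows one way to do it. The function name, the inclusive reading of "within 4", and the toy positions are assumptions, and no distinction is made between pS/pT and pY sites.

```python
def fraction_clustered(sites_by_protein, max_gap=4):
    """Fraction of sites lying within `max_gap` residues of another site.

    Only proteins with at least two distinct sites contribute, mirroring the
    restriction stated in the abstract.
    """
    clustered = 0
    total = 0
    for positions in sites_by_protein.values():
        ordered = sorted(set(positions))
        if len(ordered) < 2:
            continue
        total += len(ordered)
        for i, pos in enumerate(ordered):
            near_prev = i > 0 and pos - ordered[i - 1] <= max_gap
            near_next = i + 1 < len(ordered) and ordered[i + 1] - pos <= max_gap
            if near_prev or near_next:
                clustered += 1
    return clustered / total if total else 0.0


# Toy positions, for illustration only.
example_sites = {
    "protein_A": [10, 12, 30, 50, 53],  # 10/12 and 50/53 are clustered; 30 is isolated
    "protein_B": [7],                   # skipped: fewer than two sites
    "protein_C": [100, 200],            # two sites, neither clustered
}
result = fraction_clustered(example_sites)
print(round(result, 3))  # 0.571
```

Sorting each protein's positions means only adjacent neighbours need checking, so the cost is dominated by the sort.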


Subject(s)
Proteomics/methods; Amino Acids/chemistry; Animals; Binding Sites; Computational Biology/methods; Humans; Mass Spectrometry/methods; Models, Biological; Models, Statistical; Phosphoproteins/chemistry; Phosphorylation; Protein Processing, Post-Translational; Protein Structure, Secondary; Proteome; Tyrosine/chemistry