Results 1 - 19 of 19
1.
Cancers (Basel) ; 12(2)2020 Feb 10.
Article in English | MEDLINE | ID: mdl-32050665

ABSTRACT

The authors wish to make the following corrections to this paper [1]: The authors would like to replace Table 3 in [1]. The corrections fix typographical errors introduced when translating our database from BIC format to HGVS nomenclature, and remove four carriers who had zero follow-up time. [...].

2.
Cancers (Basel) ; 11(2)2019 Jan 23.
Article in English | MEDLINE | ID: mdl-30678073

ABSTRACT

Background: We have previously demonstrated that the Norwegian frequent pathogenic BRCA1 (path_BRCA1) variants are caused by genetic drift and recurrent de-novo mutations. We here examined the penetrance of frequent path_BRCA1 variants in fertile ages as a surrogate marker for fitness. Material and methods: We conducted an observational prospective study of penetrance for cancer in Norwegian female carriers of frequent path_BRCA1 variants, and compared our observed results to penetrance of infrequent path_BRCA1 variants and to average penetrance of path_BRCA1 variants reported by others. Results: The cumulative risk for breast cancer at 45 years in carriers of frequent path_BRCA1 variants was 20% (94% confidence interval 10–30%), compared to 35% (95% confidence interval 22–48%) in carriers of infrequent path_BRCA1 variants (p = 0.02), and to the 35% (confidence interval 32–39%) average for path_BRCA1 carriers reported by others (p = 0.0001). Discussion and conclusion: Carriers of the most frequent Norwegian path_BRCA1 variants had low incidence of cancer in fertile ages, indicating a low selective disadvantage. This, together with the variant locations being hotspots for de novo mutations and subject to genetic drift, as previously described, may have caused their high prevalence today. Besides being of theoretical interest to explain the phenomenon that a few path_BRCA1 variants are frequent, the later onset of breast cancer associated with the most frequent path_BRCA1 variants may be of interest for carriers who have to decide if and when to select prophylactic mastectomy.

3.
Gut ; 67(7): 1306-1316, 2018 07.
Article in English | MEDLINE | ID: mdl-28754778

ABSTRACT

BACKGROUND: Most patients with path_MMR gene variants (Lynch syndrome (LS)) now survive both their first and subsequent cancers, resulting in a growing number of older patients with LS for whom limited information exists with respect to cancer risk and survival. OBJECTIVE AND DESIGN: This observational, international, multicentre study aimed to determine prospectively observed incidences of cancers and survival in path_MMR carriers up to 75 years of age. RESULTS: 3119 patients were followed for a total of 24 475 years. Cumulative incidences at 75 years (risks) for colorectal cancer were 46%, 43% and 15% in path_MLH1, path_MSH2 and path_MSH6 carriers; for endometrial cancer 43%, 57% and 46%; for ovarian cancer 10%, 17% and 13%; for upper gastrointestinal (gastric, duodenal, bile duct or pancreatic) cancers 21%, 10% and 7%; for urinary tract cancers 8%, 25% and 11%; for prostate cancer 17%, 32% and 18%; and for brain tumours 1%, 5% and 1%, respectively. Ovarian cancer occurred mainly premenopausally. By contrast, upper gastrointestinal, urinary tract and prostate cancers occurred predominantly at older ages. Overall 5-year survival for prostate cancer was 100%, urinary bladder 93%, ureter 85%, duodenum 67%, stomach 61%, bile duct 29%, brain 22% and pancreas 0%. Path_PMS2 carriers had lower risk for cancer. CONCLUSION: Carriers of different path_MMR variants exhibit distinct patterns of cancer risk and survival as they age. Risk estimates for counselling and planning of surveillance and treatment should be tailored to each patient's age, gender and path_MMR variant. We have updated our open-access website www.lscarisk.org to facilitate this.


Subjects
Colonic Neoplasms/epidemiology , Colorectal Neoplasms, Hereditary Nonpolyposis/complications , Colorectal Neoplasms, Hereditary Nonpolyposis/mortality , Pancreatic Neoplasms/epidemiology , Urogenital Neoplasms/epidemiology , Age Factors , Aged , Colorectal Neoplasms, Hereditary Nonpolyposis/pathology , Databases, Factual , Female , Humans , Incidence , Male , Prospective Studies
4.
Oncotarget ; 8(44): 76290-76304, 2017 09 29.
Article in English | MEDLINE | ID: mdl-29100312

ABSTRACT

Background: Metastatic colorectal cancer (CRC) is associated with highly variable clinical outcome and response to therapy. The recently identified consensus molecular subtypes (CMS1-4) have prognostic and therapeutic implications in primary CRC, but whether these subtypes are valid for metastatic disease is unclear. We performed multi-level analyses of resectable CRC liver metastases (CLM) to identify molecular characteristics of metastatic disease and evaluate the clinical relevance. Methods: In this ancillary study to the Oslo-CoMet trial, CLM and tumor-adjacent liver tissue from 46 patients were analyzed by profiling mutations (targeted sequencing), genome-wide copy number alterations (CNAs), and gene expression. Results: Somatic mutations and CNAs detected in CLM were similar to reported primary CRC profiles, while CNA profiles of eight metastatic pairs suggested intra-patient divergence. A CMS classifier tool applied to gene expression data revealed the cohort to be highly enriched for CMS2. Hierarchical clustering of genes with highly variable expression identified two subgroups separated by high or low expression of 55 genes with immune-related and metabolic functions. Importantly, induction of genes and pathways associated with immunogenic cell death (ICD) was identified in metastases exposed to neoadjuvant chemotherapy (NACT). Conclusions: The uniform classification of CLM by CMS subtyping may indicate that novel class discovery approaches need to be explored to uncover clinically useful stratification of CLM. Detected gene expression signatures support the role of metabolism and chemotherapy in shaping the immune microenvironment of CLM. Furthermore, the results point to rational exploration of immune modulating strategies in CLM, particularly by exploiting NACT-induced ICD.

5.
Article in English | MEDLINE | ID: mdl-29046738

ABSTRACT

BACKGROUND: We have previously reported a high incidence of colorectal cancer (CRC) in carriers of pathogenic MLH1 variants (path_MLH1) despite follow-up with colonoscopy including polypectomy. METHODS: The cohort included Finnish carriers enrolled in 3-yearly colonoscopy (n = 505; 4625 observation years) and carriers from other countries enrolled in colonoscopy 2-yearly or more frequently (n = 439; 3299 observation years). We examined whether the longer interval between colonoscopies in Finland could explain the high incidence of CRC and whether disease expression correlated with differences in population CRC incidence. RESULTS: Cumulative CRC incidences in carriers of path_MLH1 at 70 years of age were 41% for males and 36% for females in the Finnish series and 58% and 55% in the non-Finnish series, respectively (p > 0.05). Mean time from last colonoscopy to CRC was 32.7 months in the Finnish compared to 31.0 months in the non-Finnish (p > 0.05) and was therefore unaffected by the recommended colonoscopy interval. Differences in population incidence of CRC could not explain the lower point estimates for CRC in the Finnish series. Ten-year overall survival after CRC was similar for the Finnish and non-Finnish series (88% and 91%, respectively; p > 0.05). CONCLUSIONS: The hypothesis that the high incidence of CRC in path_MLH1 carriers was caused by a higher incidence in the Finnish series was not valid. We discuss whether the results were influenced by methodological shortcomings in our study or whether the assumption that a shorter interval between colonoscopies leads to a lower CRC incidence may be wrong. This second possibility is intriguing, because it suggests the dogma that CRC in path_MLH1 carriers develops from polyps that can be detected at colonoscopy and removed to prevent CRC may be erroneous. In view of the excellent 10-year overall survival in the Finnish and non-Finnish series we remain strong advocates of current surveillance practices for those with LS pending studies that will inform new recommendations on the best surveillance interval.

6.
Biostatistics ; 18(3): 586-587, 2017 07 01.
Article in English | MEDLINE | ID: mdl-28334081
7.
Gut ; 66(3): 464-472, 2017 03.
Article in English | MEDLINE | ID: mdl-26657901

ABSTRACT

OBJECTIVE: Estimates of cancer risk and the effects of surveillance in Lynch syndrome have been subject to bias, partly through reliance on retrospective studies. We sought to establish more robust estimates in patients undergoing prospective cancer surveillance. DESIGN: We undertook a multicentre study of patients carrying Lynch syndrome-associated mutations affecting MLH1, MSH2, MSH6 or PMS2. Standardised information on surveillance, cancers and outcomes was collated in an Oracle relational database and analysed by age, sex and mutated gene. RESULTS: 1942 mutation carriers without previous cancer had follow-up including colonoscopic surveillance for 13 782 observation years. 314 patients developed cancer, mostly colorectal (n=151), endometrial (n=72) and ovarian (n=19). Cancers were detected from 25 years onwards in MLH1 and MSH2 mutation carriers, and from about 40 years in MSH6 and PMS2 carriers. Among the first cancers detected in each patient, the colorectal cancer cumulative incidences at 70 years by gene were 46%, 35%, 20% and 10% for MLH1, MSH2, MSH6 and PMS2 mutation carriers, respectively. The equivalent cumulative incidences for endometrial cancer were 34%, 51%, 49% and 24%; and for ovarian cancer 11%, 15%, 0% and 0%. Ten-year crude survival was 87% after any cancer, 91% if the first cancer was colorectal, 98% if endometrial and 89% if ovarian. CONCLUSIONS: The four Lynch syndrome-associated genes had different penetrance and expression. Colorectal cancer occurred frequently despite colonoscopic surveillance but resulted in few deaths. Using our data, a website has been established at http://LScarisk.org enabling calculation of cumulative cancer risks as an aid to genetic counselling in Lynch syndrome.


Subjects
Colorectal Neoplasms, Hereditary Nonpolyposis/epidemiology , Colorectal Neoplasms, Hereditary Nonpolyposis/genetics , Endometrial Neoplasms/epidemiology , Ovarian Neoplasms/epidemiology , Population Surveillance , Adolescent , Adult , Age Factors , Aged , Aged, 80 and over , Child , Colonoscopy , Colorectal Neoplasms, Hereditary Nonpolyposis/diagnostic imaging , Colorectal Neoplasms, Hereditary Nonpolyposis/mortality , DNA-Binding Proteins/genetics , Databases, Factual , Endometrial Neoplasms/mortality , Female , Gene Expression , Heterozygote , Humans , Incidence , Male , Middle Aged , Mismatch Repair Endonuclease PMS2/genetics , MutL Protein Homolog 1/genetics , MutS Homolog 2 Protein/genetics , Ovarian Neoplasms/mortality , Prospective Studies , Survival Rate , Young Adult
8.
Gut ; 66(9): 1657-1664, 2017 09.
Article in English | MEDLINE | ID: mdl-27261338

ABSTRACT

OBJECTIVE: Today most patients with Lynch syndrome (LS) survive their first cancer. There is limited information on the incidences and outcome of subsequent cancers. The present study addresses three questions: (i) what is the cumulative incidence of a subsequent cancer; (ii) in which organs do subsequent cancers occur; and (iii) what is the survival following these cancers? DESIGN: Information was collated on prospectively organised surveillance and prospectively observed outcomes in patients with LS who had cancer prior to inclusion and analysed by age, gender and genetic variants. RESULTS: 1273 patients with LS from 10 countries were followed up for 7753 observation years. 318 patients (25.7%) developed 341 first subsequent cancers, including colorectal (n=147, 43%), upper GI, pancreas or bile duct (n=37, 11%) and urinary tract (n=32, 10%). The cumulative incidences for any subsequent cancer from age 40 to age 70 years were 73% for pathogenic MLH1 (path_MLH1), 76% for path_MSH2 carriers and 52% for path_MSH6 carriers, and for colorectal cancer (CRC) the cumulative incidences were 46%, 48% and 23%, respectively. Crude survival after any subsequent cancer was 82% (95% CI 76% to 87%) and 10-year crude survival after CRC was 91% (95% CI 83% to 95%). CONCLUSIONS: Relative incidence of subsequent cancer compared with incidence of first cancer was slightly but not significantly higher than cancer incidence in patients with LS without previous cancer (range 0.94-1.49). The favourable survival after subsequent cancers validated continued follow-up to prevent death from cancer. The interactive website http://lscarisk.org was expanded to calculate the risks by gender, genetic variant and age for subsequent cancer for any patient with LS with previous cancer.


Subjects
Colonic Neoplasms , Colorectal Neoplasms, Hereditary Nonpolyposis , DNA-Binding Proteins/genetics , MutL Protein Homolog 1/genetics , MutS Homolog 2 Protein/genetics , Adult , Aged , Colonic Neoplasms/genetics , Colonic Neoplasms/pathology , Colorectal Neoplasms, Hereditary Nonpolyposis/epidemiology , Colorectal Neoplasms, Hereditary Nonpolyposis/genetics , Colorectal Neoplasms, Hereditary Nonpolyposis/pathology , DNA Mismatch Repair/genetics , Disease Progression , Europe/epidemiology , Female , Genetic Variation , Germ-Line Mutation , Humans , Incidence , Male , Middle Aged , Neoplasm Staging , Risk Assessment/methods , Risk Assessment/statistics & numerical data , Survival Analysis
9.
Biostatistics ; 17(1): 29-39, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26272994

ABSTRACT

Removal of, or adjustment for, batch effects or center differences is generally required when such effects are present in data. In particular, when preparing microarray gene expression data from multiple cohorts, array platforms, or batches for later analyses, batch effects can have confounding effects, inducing spurious differences between study groups. Many methods and tools exist for removing batch effects from data. However, when study groups are not evenly distributed across batches, actual group differences may induce apparent batch differences, in which case batch adjustments may bias, usually deflate, group differences. Some tools therefore have the option of preserving the difference between study groups, e.g. using a two-way ANOVA model to simultaneously estimate both group and batch effects. Unfortunately, this approach may systematically induce incorrect group differences in downstream analyses when groups are distributed between the batches in an unbalanced manner. The scientific community seems to be largely unaware of how this approach may lead to false discoveries.


Subjects
Data Interpretation, Statistical , Microarray Analysis/standards , Humans , Microarray Analysis/methods , Reproducibility of Results
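
The abstract of record 9 states that when study groups are unevenly distributed across batches, batch adjustment may deflate real group differences. The toy sketch below illustrates only that one claim, using naive per-batch mean centering on made-up numbers; the data, the function names and the choice of mean centering are assumptions for illustration and are not taken from the paper. It does not reproduce the paper's main point about two-way ANOVA adjustment.

```python
from statistics import mean

# Hypothetical toy data: (batch, group, expression value).
# There is NO batch effect here; group "B" is truly 2 units above group "A",
# but the groups are spread unevenly over the two batches.
samples = (
    [("batch1", "A", 0.0)] * 3 + [("batch1", "B", 2.0)] * 1 +
    [("batch2", "A", 0.0)] * 1 + [("batch2", "B", 2.0)] * 3
)

def group_difference(rows):
    """Mean of group B minus mean of group A."""
    a = [v for _, g, v in rows if g == "A"]
    b = [v for _, g, v in rows if g == "B"]
    return mean(b) - mean(a)

def center_by_batch(rows):
    """Naive batch adjustment: subtract each batch's own mean."""
    batch_means = {
        batch: mean(v for b, _, v in rows if b == batch)
        for batch in {b for b, _, _ in rows}
    }
    return [(b, g, v - batch_means[b]) for b, g, v in rows]

before = group_difference(samples)
after = group_difference(center_by_batch(samples))
print(before, after)
```

In this unbalanced design the batch means differ only because the group mix differs, so removing them shrinks the true group difference.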
10.
Breast Cancer Res ; 17: 29, 2015 Feb 26.
Article in English | MEDLINE | ID: mdl-25849221

ABSTRACT

INTRODUCTION: Breast cancer is commonly classified into intrinsic molecular subtypes. Standard gene centering is routinely done prior to molecular subtyping, but it can produce inaccurate classifications when the distribution of clinicopathological characteristics in the study cohort differs from that of the training cohort used to derive the classifier. METHODS: We propose a subgroup-specific gene-centering method to perform molecular subtyping on a study cohort that has a skewed distribution of clinicopathological characteristics relative to the training cohort. On such a study cohort, we center each gene on a specified percentile, where the percentile is determined from a subgroup of the training cohort with clinicopathological characteristics similar to the study cohort. We demonstrate our method using the PAM50 classifier and its associated University of North Carolina (UNC) training cohort. We considered study cohorts with skewed clinicopathological characteristics, including subgroups composed of a single prototypic subtype of the UNC-PAM50 training cohort (n = 139), an external estrogen receptor (ER)-positive cohort (n = 48) and an external triple-negative cohort (n = 77). RESULTS: Subgroup-specific gene centering improved prediction performance, with accuracies between 77% and 100%, compared to accuracies between 17% and 33% from standard gene centering, when applied to the prototypic tumor subsets of the PAM50 training cohort. It reduced classification error rates on the ER-positive (11% versus 28%; P = 0.0389), the ER-negative (5% versus 41%; P < 0.0001) and the triple-negative (11% versus 56%; P = 0.1336) subgroups of the PAM50 training cohort. In addition, it produced higher accuracy for subtyping study cohorts composed of varying proportions of ER-positive versus ER-negative cases. Finally, it increased the percentage of assigned luminal subtypes on the external ER-positive cohort and basal-like subtype on the external triple-negative cohort. CONCLUSIONS: Gene centering is often necessary to accurately apply a molecular subtype classifier. Compared with standard gene centering, our proposed subgroup-specific gene centering produced more accurate molecular subtype assignments in a study cohort with skewed clinicopathological characteristics relative to the training cohort.


Subjects
Biomarkers, Tumor/genetics , Breast Neoplasms/diagnosis , Breast Neoplasms/genetics , Gene Expression Profiling , Molecular Typing , Cohort Studies , Datasets as Topic , Female , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Humans , Molecular Typing/methods , Prognosis , Receptors, Estrogen/genetics
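
The abstract of record 10 describes centering each gene on a percentile derived from a matching subgroup of the training cohort. The sketch below is one possible reading for a single gene, under a loudly stated assumption: the percentile is taken to be the rank, within the training subgroup, of the full training cohort's median. The abstract does not spell out that rule, and the function names and toy values are hypothetical.

```python
from bisect import bisect_left, bisect_right
from statistics import median

def percentile_rank(values, x):
    """Fraction of values below x, counting ties as half (0..1)."""
    s = sorted(values)
    return (bisect_left(s, x) + bisect_right(s, x)) / (2 * len(s))

def quantile(values, q):
    """Quantile by linear interpolation between order statistics, q in [0, 1]."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(s) - 1)
    return s[lower] + (pos - lower) * (s[upper] - s[lower])

def subgroup_specific_center(study, training_all, training_subgroup):
    """Center one gene in the study cohort on a subgroup-derived percentile.

    Assumption: the percentile is where the full training median falls
    inside the training subgroup that resembles the study cohort.
    """
    reference = median(training_all)
    q = percentile_rank(training_subgroup, reference)
    return [v - quantile(study, q) for v in study]

# Hypothetical expression values for one gene.
training_er_neg = [1, 2, 3, 6]
training_er_pos = [5, 7, 8, 9]
study_er_pos_only = [5, 6, 7, 8, 9]

standard = [v - median(study_er_pos_only) for v in study_er_pos_only]
subgroup = subgroup_specific_center(
    study_er_pos_only, training_er_neg + training_er_pos, training_er_pos
)
print(standard)
print(subgroup)
```

With standard median centering the all-ER-positive study cohort is forced to straddle zero, whereas the subgroup-derived percentile keeps most of its values above zero, as they were relative to the mixed training cohort.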
11.
BMC Cancer ; 14: 211, 2014 Mar 19.
Article in English | MEDLINE | ID: mdl-24645668

ABSTRACT

BACKGROUND: The aim was to assess and compare prognostic power of nine breast cancer gene signatures (Intrinsic, PAM50, 70-gene, 76-gene, Genomic-Grade-Index, 21-gene-Recurrence-Score, EndoPredict, Wound-Response and Hypoxia) in relation to ER status and follow-up time. METHODS: A gene expression dataset from 947 breast tumors was used to evaluate the signatures for prediction of Distant Metastasis Free Survival (DMFS). A total of 912 patients had available DMFS status. The recently published METABRIC cohort was used as an additional validation set. RESULTS: Survival predictions were fairly concordant across most signatures. Prognostic power declined with follow-up time. During the first 5 years of follow-up, all signatures except for Hypoxia were predictive for DMFS in ER-positive disease, and 76-gene, Hypoxia and Wound-Response were prognostic in ER-negative disease. After 5 years, the signatures had little prognostic power. Gene signatures provide significant prognostic information beyond tumor size, node status and histological grade. CONCLUSIONS: Generally, these signatures performed better for ER-positive disease, indicating that risk within each ER stratum is driven by distinct underlying biology. Most of the signatures were strong risk predictors for DMFS during the first 5 years of follow-up. Combining gene signatures with histological grade or tumor size could improve the prognostic power, perhaps also of long-term survival.


Subjects
Breast Neoplasms/diagnosis , Breast Neoplasms/genetics , Databases, Genetic , Gene Expression Profiling/methods , Receptors, Estrogen/genetics , Breast Neoplasms/mortality , Cohort Studies , Female , Follow-Up Studies , Humans , Prognosis , Receptors, Estrogen/biosynthesis , Reproducibility of Results , Survival Rate/trends , Time Factors
12.
BMC Bioinformatics ; 14: 313, 2013 Oct 23.
Article in English | MEDLINE | ID: mdl-24152242

ABSTRACT

BACKGROUND: Processing of reads from high throughput sequencing is often done in terms of edges in the de Bruijn graph representing all k-mers from the reads. The memory requirements for storing all k-mers in a lookup table can be demanding, even after removal of read errors, but can be alleviated by using a memory efficient data structure. RESULTS: The FM-index, which is based on the Burrows-Wheeler transform, provides an efficient data structure providing a searchable index of all substrings from a set of strings, and is used to compactly represent full genomes for use in mapping reads to a genome: the memory required to store this is in the same order of magnitude as the strings themselves. However, reads from high throughput sequencing mostly have high coverage and so contain the same substrings multiple times from different reads. I here present a modification of the FM-index, which I call the kFM-index, for indexing the set of k-mers from the reads. For DNA sequences, this requires 5 bits of information for each vertex of the corresponding de Bruijn subgraph, i.e. for each different (k-1)-mer, plus some additional overhead, typically 0.5 to 1 bit per vertex, for storing the equivalent of the FM-index for walking the underlying de Bruijn graph and reproducing the actual k-mers efficiently. CONCLUSIONS: The kFM-index could replace more memory demanding data structures for storing the de Bruijn k-mer graph representation of sequence reads. A Java implementation with additional technical documentation is provided which demonstrates the applicability of the data structure (http://folk.uio.no/einarro/Projects/KFM-index/).


Subjects
Algorithms , Genomics/methods , Sequence Analysis, DNA/methods
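
As a rough illustration of the objects named in the abstract of record 12, the sketch below collects the distinct k-mers (edges) and (k-1)-mers (vertices) of a de Bruijn graph from a few reads and applies the abstract's stated figure of 5 bits per vertex plus 0.5 to 1 bit of overhead. It uses plain Python sets, not the kFM-index itself; the function names and the toy reads are hypothetical.

```python
def kmers_from_reads(reads, k):
    """Distinct k-mers (de Bruijn edges) occurring in the reads."""
    found = set()
    for read in reads:
        for start in range(len(read) - k + 1):
            found.add(read[start:start + k])
    return found

def vertices_from_kmers(kmers):
    """Distinct (k-1)-mers (de Bruijn vertices): prefix and suffix of each k-mer."""
    found = set()
    for kmer in kmers:
        found.add(kmer[:-1])
        found.add(kmer[1:])
    return found

def estimated_index_bits(n_vertices, overhead_bits_per_vertex=1.0):
    """Size estimate using the abstract's figure: 5 bits per vertex plus overhead."""
    return n_vertices * (5 + overhead_bits_per_vertex)

# Two overlapping toy reads; high coverage means the same k-mers recur.
reads = ["ACGTAC", "CGTACG"]
edges = kmers_from_reads(reads, k=3)
nodes = vertices_from_kmers(edges)
print(sorted(edges), sorted(nodes), estimated_index_bits(len(nodes)))
```

The two reads contribute eight k-mer occurrences but only four distinct k-mers, which is the redundancy the abstract says an index over the k-mer set can exploit.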
13.
Mol Cell Proteomics ; 12(6): 1723-34, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23438732

ABSTRACT

Protein complexes enact most biochemical functions in the cell. Dynamic interactions between protein complexes are frequent in many cellular processes. As they are often of a transient nature, they may be difficult to detect using current genome-wide screens. Here, we describe a method to computationally predict physical interactions between protein complexes, applied to both humans and yeast. We integrated manually curated protein complexes and physical protein interaction networks, and we designed a statistical method to identify pairs of protein complexes where the number of protein interactions between a complex pair is due to an actual physical interaction between the complexes. An evaluation against manually curated physical complex-complex interactions in yeast revealed that 50% of these interactions could be predicted in this manner. A community network analysis of the highest scoring pairs revealed a biologically sensible organization of physical complex-complex interactions in the cell. Such analyses of proteomes may serve as a guide to the discovery of novel functional cellular relationships.


Subjects
Algorithms , Protein Interaction Mapping/statistics & numerical data , Protein Interaction Maps , Proteome/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Databases, Protein , Humans , Likelihood Functions , Protein Binding , Protein Multimerization , Saccharomyces cerevisiae/chemistry
14.
PLoS One ; 6(3): e17845, 2011 Mar 10.
Article in English | MEDLINE | ID: mdl-21423775

ABSTRACT

BACKGROUND: Several gene sets for prediction of breast cancer survival have been derived from whole-genome mRNA expression profiles. Here, we develop a statistical framework to explore whether combination of the information from such sets may improve prediction of recurrence and breast cancer specific death in early-stage breast cancers. Microarray data from two clinically similar cohorts of breast cancer patients are used as training (n = 123) and test set (n = 81), respectively. Gene sets from eleven previously published gene signatures are included in the study. PRINCIPAL FINDINGS: To investigate the relationship between breast cancer survival and gene expression on a particular gene set, a Cox proportional hazards model is applied using partial likelihood regression with an L2 penalty to avoid overfitting and using cross-validation to determine the penalty weight. The fitted models are applied to an independent test set to obtain a predicted risk for each individual and each gene set. Hierarchical clustering of the test individuals on the basis of the vector of predicted risks results in two clusters with distinct clinical characteristics in terms of the distribution of molecular subtypes, ER, PR status, TP53 mutation status and histological grade category, and associated with significantly different survival probabilities (recurrence: p = 0.005; breast cancer death: p = 0.014). Finally, principal components analysis of the gene signatures is used to derive combined predictors used to fit a new Cox model. This model classifies test individuals into two risk groups with distinct survival characteristics (recurrence: p = 0.003; breast cancer death: p = 0.001). The latter classifier outperforms all the individual gene signatures, as well as Cox models based on traditional clinical parameters and the Adjuvant! Online for survival prediction. CONCLUSION: Combining the predictive strength of multiple gene signatures improves prediction of breast cancer survival. The presented methodology is broadly applicable to breast cancer risk assessment using any newly identified gene set.


Subjects
Breast Neoplasms/genetics , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Breast Neoplasms/diagnosis , Chromosome Mapping , Cluster Analysis , Endpoint Determination , Female , Genes, Neoplasm/genetics , Humans , Kaplan-Meier Estimate , Models, Genetic , Multivariate Analysis , Principal Component Analysis , Prognosis , Proportional Hazards Models , Risk Factors
15.
J Mol Evol ; 70(3): 266-74, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20213140

ABSTRACT

The question of whether natural selection favors genetic stability or genetic variability is a fundamental problem in evolutionary biology. Bioinformatic analyses demonstrate that selection favors genetic stability by avoiding unstable nucleotide sequences in protein encoding DNA. Yet, such unstable sequences are maintained in several DNA repair genes, thereby promoting breakdown of repair and destabilizing the genome. Several studies have therefore argued that selection favors genetic variability at the expense of stability. Here we propose a new evolutionary mechanism, with supporting bioinformatic evidence, that resolves this paradox. Combining the concepts of gene-dependent mutation biases and meiotic recombination, we argue that unstable sequences in the DNA mismatch repair (MMR) genes are maintained by their own phenotype. In particular, we predict that human MMR maintains an overrepresentation of mononucleotide repeats (monorepeats) within and around the MMR genes. In support of this hypothesis, we report a 31% excess in monorepeats in 250 kb regions surrounding the seven MMR genes compared to all other RefSeq genes (1.75 vs. 1.34%, P = 0.0047), with a particularly high content in PMS2 (2.41%, P = 0.0047) and MSH6 (2.07%, P = 0.043). Based on a mathematical model of monorepeat frequency, we argue that the proposed mechanism may suffice to explain the observed excess of repeats around MMR genes. Our findings thus indicate that unstable sequences in MMR genes are maintained through evolution by the MMR mechanism. The evolutionary paradox of genetically unstable DNA repair genes may thus be explained by an equilibrium in which the phenotype acts back on its own genotype.


Subjects
Base Sequence/physiology , DNA Repair/genetics , Genetic Variation/physiology , Genomic Instability/physiology , Evolution, Molecular , Gene Frequency , Genes/physiology , Humans , Models, Biological , Models, Genetic , Models, Theoretical , Phenotype , Repetitive Sequences, Nucleic Acid/genetics , Sequence Analysis, DNA
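
The abstract of record 15 reports a 31% excess of monorepeats (1.75 vs. 1.34%). As a quick arithmetic check, assuming "excess" means the relative difference between the two percentages (the abstract does not define it), the figures are consistent:

```python
# Monorepeat content reported in the abstract (percent of sequence).
mmr_regions_percent = 1.75    # 250 kb regions around the seven MMR genes
other_genes_percent = 1.34    # all other RefSeq genes

# Assumed definition: relative excess = ratio of the two contents minus one.
relative_excess = mmr_regions_percent / other_genes_percent - 1
print(f"{relative_excess:.1%}")
```

The ratio comes to roughly 1.31, which rounds to the 31% quoted in the abstract.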
16.
Bioinformatics ; 25(8): 996-1003, 2009 Apr 15.
Article in English | MEDLINE | ID: mdl-19244388

ABSTRACT

MOTIVATION: Helix-helix interactions play a critical role in the structure assembly, stability and function of membrane proteins. On the molecular level, the interactions are mediated by one or more residue contacts. Although previous studies focused on helix-packing patterns and sequence motifs, few of them developed methods specifically for contact prediction. RESULTS: We present a new hierarchical framework for contact prediction, with an application in membrane proteins. The hierarchical scheme consists of two levels: in the first level, contact residues are predicted from the sequence and their pairing relationships are further predicted in the second level. Statistical analyses on contact propensities are combined with other sequence and structural information for training the support vector machine classifiers. Evaluated on 52 protein chains using leave-one-out cross validation (LOOCV) and an independent test set of 14 protein chains, the two-level approach consistently improves on the conventional direct approach in prediction accuracy, with 80% reduction of input for prediction. Furthermore, the predicted contacts are then used to infer interactions between pairs of helices. When at least three predicted contacts are required for an inferred interaction, the accuracy, sensitivity and specificity are 56%, 40% and 89%, respectively. Our results demonstrate that a hierarchical framework can be applied to eliminate false positives (FP) while reducing computational complexity in predicting contacts. Together with the estimated contact propensities, this method can be used to gain insights into helix-packing in membrane proteins.


Subjects
Computational Biology/methods , Membrane Proteins/chemistry , Databases, Protein , Membrane Proteins/metabolism , Models, Biological , Protein Structure, Secondary , Reproducibility of Results
17.
Nucleic Acids Res ; 35(9): 3100-8, 2007.
Article in English | MEDLINE | ID: mdl-17452365

ABSTRACT

The publication of a complete genome sequence is usually accompanied by annotations of its genes. In contrast to protein coding genes, genes for ribosomal RNA (rRNA) are often poorly or inconsistently annotated. This makes comparative studies based on rRNA genes difficult. We have therefore created computational predictors for the major rRNA species from all kingdoms of life and compiled them into a program called RNAmmer. The program uses hidden Markov models trained on data from the 5S ribosomal RNA database and the European ribosomal RNA database project. A pre-screening step makes the method fast with little loss of sensitivity, enabling the analysis of a complete bacterial genome in less than a minute. Results from running RNAmmer on a large set of genomes indicate that the location of rRNAs can be predicted with a very high level of accuracy. Novel, unannotated rRNAs are also predicted in many genomes. The software as well as the genome analysis results are available at the CBS web server.


Subjects
Genes, rRNA , Software , Computational Biology/methods , Genome, Bacterial , Genomics/methods , Markov Chains
18.
FEMS Immunol Med Microbiol ; 49(2): 243-51, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17284282

ABSTRACT

Neisseria meningitidis, or the meningococcus, is the source of significant morbidity and mortality in humans worldwide. Even though mutability has been linked to the occurrence of outbreaks of epidemic disease, meningococcal DNA repair pathways are poorly delineated. For the first time, a collection of meningococcal disease-associated isolates has been demonstrated to express constitutively the DNA glycosylases MutY and Fpg in vivo. DNA sequence analysis showed considerable variability in the deduced amino acid sequences of MutS and Fpg, while MutY and RecA were highly conserved. Interestingly, multi-locus sequence typing demonstrated a putative link between the pattern of amino acid substitutions and levels of spontaneous mutagenicity in meningococcal strains. These results provide a basis for further studies aimed at resolving the genotype/phenotype relationships of meningococcal genome variability and mutator activity.


Subjects
Bacterial Proteins/genetics , DNA Repair/genetics , Neisseria meningitidis/genetics , Anti-Bacterial Agents/pharmacology , Base Sequence , DNA Glycosylases/genetics , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , DNA-Formamidopyrimidine Glycosylase/genetics , DNA-Formamidopyrimidine Glycosylase/metabolism , Drug Resistance, Bacterial , Humans , Hydrogen Peroxide/pharmacology , Microbial Viability , Molecular Sequence Data , MutS DNA Mismatch-Binding Protein/genetics , Mutation , Neisseria meningitidis/drug effects , Neisseria meningitidis/physiology , Oxidative Stress , Phylogeny , Polymorphism, Genetic , Rec A Recombinases/genetics , Rifampin/pharmacology , Sequence Homology, Amino Acid
19.
J Comput Biol ; 13(6): 1197-213, 2006.
Article in English | MEDLINE | ID: mdl-16901237

ABSTRACT

A number of non-coding RNA are known to contain functionally important or conserved pseudoknots. However, pseudoknotted structures are more complex than orthodox structures, and most methods for analyzing secondary structures do not handle them. I present here a way to decompose and represent general secondary structures which extends the tree representation of the stem-loop structure, and use this to analyze the frequency of pseudoknots in known and in random secondary structures. This comparison shows that, though a number of pseudoknots exist, they are still relatively rare and mostly of the simpler kinds. In contrast, random secondary structures tend to be heavily knotted, and the number of available structures increases dramatically when allowing pseudoknots. Therefore, methods for structure prediction and non-coding RNA identification that allow pseudoknots are likely to be much less powerful than those that do not, unless they penalize pseudoknots appropriately.


Subjects
Models, Molecular , Nucleic Acid Conformation , RNA/chemistry , Algorithms , Sequence Analysis, RNA
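
The abstract of record 19 contrasts orthodox (nested) secondary structures with pseudoknotted ones. The sketch below shows only the standard crossing-pair test that separates the two: a structure is pseudoknotted if two base pairs (i, j) and (k, l) interleave as i < k < j < l. It is not the decomposition described in the abstract, and the function name and example structures are hypothetical.

```python
def has_pseudoknot(base_pairs):
    """Return True if any two base pairs cross (i < k < j < l).

    base_pairs: iterable of (i, j) sequence positions that are paired.
    Nested or side-by-side pairs form an ordinary stem-loop tree;
    crossing pairs cannot be drawn as such a tree.
    """
    pairs = sorted((min(i, j), max(i, j)) for i, j in base_pairs)
    for index, (i, j) in enumerate(pairs):
        for k, l in pairs[index + 1:]:
            if i < k < j < l:
                return True
    return False

# A plain hairpin: all pairs nested inside one another.
hairpin = [(0, 9), (1, 8), (2, 7)]
# A pseudoknot-like arrangement: the pair (3, 8) crosses the stem (0, 5), (1, 4).
knotted = [(0, 5), (1, 4), (3, 8)]

print(has_pseudoknot(hairpin), has_pseudoknot(knotted))
```

This quadratic check is adequate for small examples; it says nothing about how complex a pseudoknot is, which is what the abstract's decomposition addresses.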