Results 1 - 20 of 49
1.
Int J Cancer ; 155(5): 934-945, 2024 Sep 01.
Article in English | MEDLINE | ID: mdl-38709956

ABSTRACT

We analyzed variations in the epidermal growth factor receptor (EGFR) gene and 5'-upstream region to identify potential molecular predictors of treatment response in primary epithelial ovarian cancer. Tumor tissues collected during debulking surgery from the prospective multicenter OVCAD study were investigated. Copy number variations in the human endogenous retrovirus sequence human endogenous retrovirus K9 (HERVK9) and EGFR Exons 7 and 9, as well as repeat length and loss of heterozygosity of polymorphic CA-SSR I and relative EGFR mRNA expression were determined quantitatively. At least one EGFR variation was observed in 94% of the patients. Among the 30 combinations of variations discovered, enhanced platinum sensitivity (n = 151) was found dominantly with HERVK9 haploidy and Exon 7 tetraploidy, overrepresented among patients with survival ≥120 months (24/29, p = .0212). EGFR overexpression (≥80 percentile) was significantly less likely in the responders (17% vs. 32%, p = .044). Multivariate Cox regression analysis, including age, FIGO stage, and grade, indicated that the patients' subgroup was prognostically significant for CA-SSR I repeat length <18 CA for both alleles (HR 0.276, 95% confidence interval 0.109-0.655, p = .001). Although EGFR variations occur in ovarian cancer, the mRNA levels remain low compared to other EGFR-mutated cancers. Notably, the inherited length of the CA-SSR I repeat, HERVK9 haploidy, and Exon 7 tetraploidy conferred three times higher odds ratio to survive for more than 10 years under therapy. This may add value in guiding therapies if determined during follow-up in circulating tumor cells or circulating tumor DNA and offers HERVK9 as a potential therapeutic target.


Subject(s)
Human Chromosome Pair 7, DNA Copy Number Variations, ErbB Receptors, Ovarian Neoplasms, Humans, Female, ErbB Receptors/genetics, Ovarian Neoplasms/genetics, Ovarian Neoplasms/mortality, Ovarian Neoplasms/pathology, Ovarian Neoplasms/drug therapy, Middle Aged, Human Chromosome Pair 7/genetics, Prospective Studies, Aged, Ovarian Epithelial Carcinoma/genetics, Ovarian Epithelial Carcinoma/mortality, Ovarian Epithelial Carcinoma/pathology, Adult, Retroelements/genetics, Phenotype, Antineoplastic Drug Resistance/genetics, Endogenous Retroviruses/genetics, Loss of Heterozygosity
2.
Nucleic Acids Res ; 47(1): 341-361, 2019 Jan 10.
Article in English | MEDLINE | ID: mdl-30357366

ABSTRACT

The RNA-binding protein TDP-43 is heavily implicated in neurodegenerative disease. Numerous patient mutations in TARDBP, the gene encoding TDP-43, combined with data from animal and cell-based models, imply that altered RNA regulation by TDP-43 causes Amyotrophic Lateral Sclerosis and Frontotemporal Dementia. However, underlying mechanisms remain unresolved. Increased cytoplasmic TDP-43 levels in diseased neurons suggest a possible role in this cellular compartment. Here, we examined the impact on translation of overexpressing human TDP-43 and the TDP-43(A315T) patient mutant protein in motor neuron-like cells and primary cultures of cortical neurons. In motor neuron-like cells, TDP-43 associates with ribosomes without significantly affecting global translation. However, ribosome profiling and additional assays revealed enhanced translation and direct binding of Camta1, Mig12, and Dennd4a mRNAs. Overexpressing either wild-type TDP-43 or TDP-43(A315T) stimulated translation of Camta1 and Mig12 mRNAs via their 5'UTRs and increased CAMTA1 and MIG12 protein levels. In contrast, translational enhancement of Dennd4a mRNA required a specific 3'UTR region and was specifically observed with the TDP-43(A315T) patient mutant allele. Our data reveal that TDP-43 can function as an mRNA-specific translational enhancer. Moreover, since CAMTA1 and DENND4A are linked to neurodegeneration, they suggest that this function could contribute to disease.


Subject(s)
Calcium-Binding Proteins/genetics, DNA-Binding Proteins/genetics, Neurodegenerative Diseases/genetics, Trans-Activators/genetics, Amyotrophic Lateral Sclerosis/genetics, Amyotrophic Lateral Sclerosis/pathology, Animals, Cytoplasm/genetics, Cytoplasm/metabolism, Frontotemporal Dementia/genetics, Frontotemporal Dementia/pathology, Gene Expression Regulation/genetics, Humans, Mice, Microtubule-Associated Proteins/genetics, Motor Neurons/metabolism, Motor Neurons/pathology, Mutation, Neurodegenerative Diseases/pathology, Primary Cell Culture, Messenger RNA/genetics, Ribosomes/genetics
3.
Bioinformatics ; 35(16): 2853-2855, 2019 Aug 15.
Article in English | MEDLINE | ID: mdl-30596893

ABSTRACT

SUMMARY: The graphical fragment assembly (GFA) formats are emerging standard formats for the representation of sequence graphs. Although GFA 1 was primarily targeting assembly graphs, the newer GFA 2 format introduces several features, which makes it suitable for representing other kinds of information, such as scaffolding graphs, variation graphs, alignment graphs and colored metagenomic graphs. Here, we present GfaViz, an interactive graphical tool for the visualization of sequence graphs in GFA format. The software supports all new features of GFA 2 and introduces conventions for their visualization. The user can choose between two different layouts and multiple styles for representing single elements or groups. All customizations can be stored in custom tags of the GFA format itself, without requiring external configuration files. Stylesheets are supported for storing standard configuration options for groups of files. The visualizations can be exported to raster and vector graphics formats. A command line interface allows for batch generation of images. AVAILABILITY AND IMPLEMENTATION: GfaViz is available at https://github.com/ggonnella/gfaviz. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software, Metagenome, Sequence Analysis
4.
Bioinformatics ; 33(19): 3094-3095, 2017 Oct 01.
Article in English | MEDLINE | ID: mdl-28645150

ABSTRACT

SUMMARY: GFA 1 and GFA 2 are recently defined formats for representing sequence graphs, such as assembly, variation or splicing graphs. The formats are adopted by several software tools. Here, we present GfaPy, a software package for creating, parsing and editing GFA graphs using the programming language Python. GfaPy supports GFA 1 and GFA 2, using the same interface and allows for interconversion between both formats. The software package provides a simple interface for custom record types, which is an important new feature of GFA 2 (compared to GFA 1). This enables new applications of the format. AVAILABILITY AND IMPLEMENTATION: GfaPy is available open source at https://github.com/ggonnella/gfapy and installable via pip. CONTACT: gonnella@zbh.uni-hamburg.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
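
To make the record structure of GFA 1 concrete, the following is a minimal sketch in plain Python. It does not use GfaPy and does not reproduce its interface; the function name `parse_gfa1`, the two record tuples, and the example graph are hypothetical, and only segment (S) and link (L) lines are read.

```python
from collections import namedtuple

# Hypothetical record types for illustration; not GfaPy classes.
Segment = namedtuple("Segment", "name sequence")
Link = namedtuple("Link", "from_name from_orient to_name to_orient overlap")


def parse_gfa1(text):
    """Collect S and L records from GFA 1 text; other record types are skipped."""
    segments, links = {}, []
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S" and len(fields) >= 3:
            # S <name> <sequence> [optional tags]
            segments[fields[1]] = Segment(fields[1], fields[2])
        elif fields[0] == "L" and len(fields) >= 6:
            # L <from> <from orientation> <to> <to orientation> <overlap CIGAR>
            links.append(Link(*fields[1:6]))
    return segments, links


# Made-up example graph: two segments joined by a 3-base overlap.
EXAMPLE = "\n".join([
    "H\tVN:Z:1.0",
    "S\ts1\tACGTAC",
    "S\ts2\tTACGGA",
    "L\ts1\t+\ts2\t+\t3M",
])

segments, links = parse_gfa1(EXAMPLE)
print(sorted(segments), links[0].overlap)
```

A dedicated library is preferable for real data, since this sketch ignores optional tags, containments, paths, and all GFA 2 record types.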


Subject(s)
Sequence Analysis/methods, Software, Computer Graphics, Programming Languages
5.
Appl Environ Microbiol ; 80(15): 4585-98, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24837379

ABSTRACT

The active venting Sisters Peak (SP) chimney on the Mid-Atlantic Ridge holds the current temperature record for the hottest ever measured hydrothermal fluids (400°C, accompanied by sudden temperature bursts reaching 464°C). Given the unprecedented temperature regime, we investigated the biome of this chimney with a focus on special microbial adaptations for thermal tolerance. The SP metagenome reveals considerable differences in the taxonomic composition from those of other hydrothermal vent and subsurface samples; these could be better explained by temperature than by other available abiotic parameters. The most common species to which SP genes were assigned were thermophilic Aciduliprofundum sp. strain MAR08-339 (11.8%), Hippea maritima (3.8%), Caldisericum exile (1.5%), and Caminibacter mediatlanticus (1.4%) as well as to the mesophilic Niastella koreensis (2.8%). A statistical analysis of associations between taxonomic and functional gene assignments revealed specific overrepresented functional categories: for Aciduliprofundum, protein biosynthesis, nucleotide metabolism, and energy metabolism genes; for Hippea and Caminibacter, cell motility and/or DNA replication and repair system genes; and for Niastella, cell wall and membrane biogenesis genes. Cultured representatives of these organisms inhabit different thermal niches; i.e., Aciduliprofundum has an optimal growth temperature of 70°C, Hippea and Caminibacter have optimal growth temperatures around 55°C, and Niastella grows between 10 and 37°C. Therefore, we posit that the different enrichment profiles of functional categories reflect distinct microbial strategies to deal with the different impacts of the local sudden temperature bursts in disparate regions of the chimney.


Subject(s)
Bacteria/isolation & purification, Seawater/microbiology, Bacteria/classification, Bacteria/genetics, Bacteria/growth & development, Hot Temperature, Molecular Sequence Data, Phylogeny, Seawater/chemistry
6.
J Pathol ; 231(1): 130-41, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23794398

ABSTRACT

Deletion of 3p13 has been reported from about 20% of prostate cancers. The clinical significance of this alteration and the tumour suppressor gene(s) driving the deletion remain to be identified. We have mapped the 3p13 deletion locus using SNP array analysis and performed fluorescence in situ hybridization (FISH) analysis to search for associations between 3p13 deletion, prostate cancer phenotype and patient prognosis in a tissue microarray containing more than 3200 prostate cancers. SNP array analysis of 72 prostate cancers revealed a small deletion at 3p13 in 14 (19%) of the tumours, including the putative tumour suppressors FOXP1, RYBP and SHQ1. FISH analysis using FOXP1-specific probes revealed deletions in 16.5% and translocations in 1.2% of 1828 interpretable cancers. 3p13 deletions were linked to adverse features of prostate cancer, including advanced stage (p < 0.0001), high Gleason grade (p = 0.0125), and early PSA recurrence (p = 0.0015). In addition, 3p13 deletions were linked to ERG(+) cancers and to PTEN deletions (p < 0.0001 each). A subset analysis of ERG(+) tumours revealed that 3p13 deletions occurred independently from PTEN deletions (p = 0.3126), identifying tumours with 3p13 deletion as a distinct molecular subset of ERG(+) cancers. mRNA expression analysis confirmed that all 3p13 genes were down regulated by the deletion. Ectopic over-expression of FOXP1, RYBP and SHQ1 resulted in decreased colony-formation capabilities, corroborating a tumour suppressor function for all three genes. In summary, our data show that deletion of 3p13 defines a distinct and aggressive molecular subset of ERG(+) prostate cancers, which is possibly driven by inactivation of multiple tumour suppressors.


Subject(s)
Adenocarcinoma/genetics, Chromosome Deletion, Human Chromosome Pair 3/genetics, Tumor Suppressor Genes, Prostatic Neoplasms/genetics, Adenocarcinoma/metabolism, Adenocarcinoma/mortality, Adenocarcinoma/pathology, Tumor Cell Line, Forkhead Transcription Factors/genetics, Forkhead Transcription Factors/metabolism, Gene Expression Profiling, Gene Knockdown Techniques, Germany/epidemiology, Humans, Kaplan-Meier Estimate, Male, Local Neoplasm Recurrence, Oligonucleotide Array Sequence Analysis, Oncogene Fusion Proteins/metabolism, Single Nucleotide Polymorphism, Prostate/metabolism, Prostate/pathology, Prostatectomy, Prostatic Neoplasms/metabolism, Prostatic Neoplasms/mortality, Prostatic Neoplasms/pathology, Repressor Proteins/genetics, Repressor Proteins/metabolism, Tissue Array Analysis
7.
BMC Bioinformatics ; 14: 226, 2013 Jul 17.
Article in English | MEDLINE | ID: mdl-23865810

ABSTRACT

BACKGROUND: It is well known that the search for homologous RNAs is more effective if both sequence and structure information is incorporated into the search. However, current tools for searching with RNA sequence-structure patterns cannot fully handle mutations occurring on both these levels or are simply not fast enough for searching large sequence databases because of the high computational costs of the underlying sequence-structure alignment problem. RESULTS: We present new fast index-based and online algorithms for approximate matching of RNA sequence-structure patterns supporting a full set of edit operations on single bases and base pairs. Our methods efficiently compute semi-global alignments of structural RNA patterns and substrings of the target sequence whose costs satisfy a user-defined sequence-structure edit distance threshold. For this purpose, we introduce a new computing scheme to optimally reuse the entries of the required dynamic programming matrices for all substrings and combine it with a technique for avoiding the alignment computation of non-matching substrings. Our new index-based methods exploit suffix arrays preprocessed from the target database and achieve running times that are sublinear in the size of the searched sequences. To support the description of RNA molecules that fold into complex secondary structures with multiple ordered sequence-structure patterns, we use fast algorithms for the local or global chaining of approximate sequence-structure pattern matches. The chaining step removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our improved online algorithm is faster than the best previous method by up to factor 45. Our best new index-based algorithm achieves a speedup of factor 560. CONCLUSIONS: The presented methods achieve considerable speedups compared to the best previous method. This, together with the expected sublinear running time of the presented index-based algorithms, allows for the first time approximate matching of RNA sequence-structure patterns in large sequence databases. Beyond the algorithmic contributions, we provide with RaligNAtor a robust and well documented open-source software package implementing the algorithms presented in this manuscript. The RaligNAtor software is available at http://www.zbh.uni-hamburg.de/ralignator.


Subject(s)
Algorithms, Base Sequence, RNA Sequence Analysis, Base Pairing, Base Sequence/genetics, Computational Biology/methods, Computer Simulation, Factual Databases, RNA/chemistry, RNA/genetics, Sequence Alignment, RNA Sequence Analysis/methods, Software
8.
Environ Microbiol ; 15(5): 1551-60, 2013 May.
Article in English | MEDLINE | ID: mdl-23171403

ABSTRACT

We present data on the co-registered geochemistry (in situ mass spectrometry) and microbiology (pyrosequencing of 16S rRNA genes; V1, V2, V3 regions) in five fluid samples from Irina II in the Logatchev hydrothermal field. Two samples were collected over 24 min from the same spot and further three samples were from spatially distinct locations (20 cm, 3 m and the overlaying plume). Four low-temperature hydrothermal fluids from the Irina II are composed of the same core bacterial community, namely specific Gammaproteobacteria and Epsilonproteobacteria, which, however, differs in the relative abundance. The microbial composition of the fifth sample (plume) is considerably different. Although a significant correlation between sulfide enrichment and proportions of Sulfurovum (Epsilonproteobacteria) was found, no other significant linkages between abiotic factors, i.e. temperature, hydrogen, methane, sulfide and oxygen, and bacterial lineages were evident. Intriguingly, bacterial community compositions of some time series samples from the same spot were significantly more similar to a sample collected 20 cm away than to each other. Although this finding is based on three single samples only, it provides first hints that single hydrothermal fluid samples collected on a small spatial scale may also reflect unrecognized temporal variability. However, further studies are required to support this hypothesis.


Subject(s)
Biodiversity, Hydrothermal Vents/chemistry, Hydrothermal Vents/microbiology, Seawater/chemistry, Seawater/microbiology, Hydrogen-Ion Concentration, Magnesium/analysis, Oxygen/analysis, Proteobacteria/genetics, Proteobacteria/isolation & purification, 16S Ribosomal RNA/genetics, Temperature, Time Factors
9.
Am J Pathol ; 181(2): 401-12, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22705054

ABSTRACT

The phosphatase and tensin homolog deleted on chromosome 10 (PTEN) gene is often altered in prostate cancer. To determine the prevalence and clinical significance of the different mechanisms of PTEN inactivation, we analyzed PTEN deletions in TMAs containing 4699 hormone-naïve and 57 hormone-refractory prostate cancers using fluorescence in situ hybridization analysis. PTEN mutations and methylation were analyzed in subsets of 149 and 34 tumors, respectively. PTEN deletions were present in 20.2% (458/2266) of prostate cancers, including 8.1% heterozygous and 12.1% homozygous deletions, and were linked to advanced tumor stage (P < 0.0001), high Gleason grade (P < 0.0001), presence of lymph node metastasis (P = 0.0002), hormone-refractory disease (P < 0.0001), presence of ERG gene fusion (P < 0.0001), and nuclear p53 accumulation (P < 0.0001). PTEN deletions were also associated with early prostate-specific antigen recurrence in univariate (P < 0.0001) and multivariate (P = 0.0158) analyses. The prognostic impact of PTEN deletion was seen in both ERG fusion-positive and ERG fusion-negative tumors. PTEN mutations were found in 4 (12.9%) of 31 cancers with heterozygous PTEN deletions but in only 1 (2%) of 59 cancers without PTEN deletion (P = 0.027). Aberrant PTEN promoter methylation was not detected in 34 tumors. The results of this study demonstrate that biallelic PTEN inactivation, by either homozygous deletion or deletion of one allele and mutation of the other, occurs in most PTEN-defective cancers and characterizes a particularly aggressive subset of metastatic and hormone-refractory prostate cancers.


Subject(s)
Gene Deletion, Oncogene Fusion Proteins/metabolism, PTEN Phosphohydrolase/genetics, Prostate-Specific Antigen/metabolism, Prostatic Neoplasms/enzymology, Prostatic Neoplasms/pathology, Trans-Activators/metabolism, Aged, Tumor Biomarkers/metabolism, Human Chromosome Pair 10/genetics, DNA Methylation/genetics, DNA Mutational Analysis, Disease Progression, Genetic Epigenesis, Human Genome/genetics, Humans, Immunohistochemistry, Male, Middle Aged, Multivariate Analysis, PTEN Phosphohydrolase/metabolism, Phenotype, Single Nucleotide Polymorphism/genetics, Genetic Promoter Regions/genetics, Proportional Hazards Models, Recurrence, Transcriptional Regulator ERG, Tumor Suppressor Protein p53/metabolism
10.
BMC Bioinformatics ; 13: 82, 2012 May 06.
Article in English | MEDLINE | ID: mdl-22559072

ABSTRACT

BACKGROUND: Ongoing improvements in throughput of the next-generation sequencing technologies challenge the current generation of de novo sequence assemblers. Most recent sequence assemblers are based on the construction of a de Bruijn graph. An alternative framework of growing interest is the assembly string graph, not necessitating a division of the reads into k-mers, but requiring fast algorithms for the computation of suffix-prefix matches among all pairs of reads. RESULTS: Here we present efficient methods for the construction of a string graph from a set of sequencing reads. Our approach employs suffix sorting and scanning methods to compute suffix-prefix matches. Transitive edges are recognized and eliminated early in the process and the graph is efficiently constructed including irreducible edges only. CONCLUSIONS: Our suffix-prefix match determination and string graph construction algorithms have been implemented in the software package Readjoiner. Comparison with existing string graph-based assemblers shows that Readjoiner is faster and more space efficient. Readjoiner is available at http://www.zbh.uni-hamburg.de/readjoiner.
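
The two steps named in this abstract, computing suffix-prefix matches and discarding transitive edges, can be illustrated with a deliberately naive sketch. This is not Readjoiner's algorithm: it compares all read pairs in quadratic time instead of using suffix sorting, and it uses a simplified transitivity test that ignores overlap lengths. All function names and the three toy reads are hypothetical.

```python
def longest_overlap(a, b, min_len):
    """Length of the longest proper suffix of a equal to a prefix of b.

    Returns 0 if no overlap of at least min_len exists (min_len >= 1).
    """
    for length in range(min(len(a), len(b)) - 1, min_len - 1, -1):
        if a[-length:] == b[:length]:
            return length
    return 0


def overlap_graph(reads, min_len):
    """Map (i, j) to the suffix-prefix overlap length of reads[i] and reads[j]."""
    edges = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                length = longest_overlap(a, b, min_len)
                if length:
                    edges[(i, j)] = length
    return edges


def remove_transitive_edges(edges):
    """Drop i->k whenever i->j and j->k are also edges (simplified criterion)."""
    successors = {}
    for i, j in edges:
        successors.setdefault(i, set()).add(j)
    transitive = {
        (i, k)
        for i, direct in successors.items()
        for j in direct
        for k in successors.get(j, ())
        if k in direct
    }
    return {edge: length for edge, length in edges.items() if edge not in transitive}


# Toy reads sampled from the made-up sequence ACGTTGCAAT.
reads = ["ACGTTG", "GTTGCA", "TGCAAT"]
edges = overlap_graph(reads, min_len=2)
print(edges)
print(remove_transitive_edges(edges))
```

In the toy example the edge from read 0 to read 2 is implied by the path through read 1, so only the two irreducible edges remain.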


Subject(s)
Software, Algorithms, Computer Simulation, Human Genome/genetics, Humans, Genetic Models, DNA Sequence Analysis/methods
11.
BMC Bioinformatics ; 12: 214, 2011 May 27.
Article in English | MEDLINE | ID: mdl-21619640

ABSTRACT

BACKGROUND: The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires to match sequence-structure patterns. Unfortunately, current tools for this task have, in the best case, a running time that is only linear in the size of sequence databases. Furthermore, established index data structures for fast sequence matching, like suffix trees or arrays, cannot benefit from the complementarity constraints introduced by the secondary structure of RNAs. RESULTS: We present a novel method and readily applicable software for time efficient matching of RNA sequence-structure patterns in sequence databases. Our approach is based on affix arrays, a recently introduced index data structure, preprocessed from the target database. Affix arrays support bidirectional pattern search, which is required for efficiently handling the structural constraints of the pattern. Structural patterns like stem-loops can be matched inside out, such that the loop region is matched first and then the pairing bases on the boundaries are matched consecutively. This allows to exploit base pairing information for search space reduction and leads to an expected running time that is sublinear in the size of the sequence database. The incorporation of a new chaining approach in the search of RNA sequence-structure patterns enables the description of molecules folding into complex secondary structures with multiple ordered patterns. The chaining approach removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our method runs up to two orders of magnitude faster than previous methods. CONCLUSIONS: The presented method's sublinear expected running time makes it well suited for RNA sequence-structure pattern matching in large sequence databases. RNA molecules containing several stem-loop substructures can be described by multiple sequence-structure patterns and their matches are efficiently handled by a novel chaining method. Beyond our algorithmic contributions, we provide with Structator a complete and robust open-source software solution for index-based search of RNA sequence-structure patterns. The Structator software is available at http://www.zbh.uni-hamburg.de/Structator.
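
What it means for a target substring to match a sequence-structure pattern can be sketched as follows. The code is a plain left-to-right scan with linear running time, so it illustrates only the matching constraint and not the affix-array index or the inside-out search described in the abstract. The pattern syntax (a sequence string with `N` as wildcard plus a dot-bracket string of equal length), the function names, the acceptance of G-U wobble pairs, and the example target are all assumptions made for illustration.

```python
# Watson-Crick pairs plus G-U wobble pairs (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pair_table(structure):
    """List the (open, close) index pairs of a dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            pairs.append((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced structure string")
    return pairs


def find_matches(target, seq_pattern, structure):
    """Start positions where both the sequence and the pairing constraints hold."""
    if len(seq_pattern) != len(structure):
        raise ValueError("sequence and structure patterns differ in length")
    pairs = pair_table(structure)
    width = len(seq_pattern)
    hits = []
    for start in range(len(target) - width + 1):
        window = target[start:start + width]
        seq_ok = all(p == "N" or p == c for p, c in zip(seq_pattern, window))
        if seq_ok and all((window[i], window[j]) in PAIRS for i, j in pairs):
            hits.append(start)
    return hits


# Hypothetical stem-loop: a GAAA loop closed by three unspecified base pairs.
hits = find_matches("AAGGCGAAAGCCUU", "NNNGAAANNN", "(((....)))")
print(hits)
```

The window starting at position 2 contains the GAAA loop flanked by three complementary pairs (G-C, G-C, C-G), so it is the only hit.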


Subject(s)
Algorithms, RNA/chemistry, RNA Sequence Analysis/methods, Software, Base Sequence, Nucleic Acid Conformation, RNA/genetics
12.
Nucleic Acids Res ; 37(21): 7002-13, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19786494

ABSTRACT

Long terminal repeat (LTR) retrotransposons and endogenous retroviruses (ERVs) are transposable elements in eukaryotic genomes well suited for computational identification. De novo identification tools determine the position of potential LTR retrotransposon or ERV insertions in genomic sequences. For further analysis, it is desirable to obtain an annotation of the internal structure of such candidates. This article presents LTRdigest, a novel software tool for automated annotation of internal features of putative LTR retrotransposons. It uses local alignment and hidden Markov model-based algorithms to detect retrotransposon-associated protein domains as well as primer binding sites and polypurine tracts. As an example, we used LTRdigest results to identify 88 (near) full-length ERVs in the chromosome 4 sequence of Mus musculus, separating them from truncated insertions and other repeats. Furthermore, we propose a work flow for the use of LTRdigest in de novo LTR retrotransposon classification and perform an exemplary de novo analysis on the Drosophila melanogaster genome as a proof of concept. Using a new method solely based on the annotations generated by LTRdigest, 518 potential LTR retrotransposons were automatically assigned to 62 candidate groups. Representative sequences from 41 of these 62 groups were matched to reference sequences with >80% global sequence similarity.


Subject(s)
Retroelements, Software, Terminal Repeat Sequences, Animals, Mammalian Chromosomes, Classification/methods, Drosophila melanogaster/genetics, Endogenous Retroviruses/genetics, Insect Genome, Genomics, Mice
13.
Genes Chromosomes Cancer ; 49(1): 1-8, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19787783

ABSTRACT

Recently, amplification of PPFIA1, encoding a member of the liprin family located about 600 kb telomeric to CCND1 on chromosome band 11q13, was described in squamous cell carcinoma of head and neck. Because 11q13 amplification is frequent in breast cancer, and PPFIA1 has been suggested to contribute to mammary gland development, we hypothesized that PPFIA1 might also be involved in the 11q13 amplicon in breast cancer and contribute to breast cancer development. A tissue microarray containing more than 2000 human breast cancers was analyzed for gene copy numbers of PPFIA1 and CCND1 by means of fluorescence in situ hybridization. PPFIA1 amplification was found in 248/1583 (15.4%) of breast cancers. Coamplification with CCND1 was found in all (248/248, 100%) PPFIA1-amplified cancers. CCND1 amplification without PPFIA1 coamplification was found in additional 117 (4.7%) tumors. Amplification of both PPFIA1 and CCND1 were significantly associated with high-grade phenotype (P = 0.0002) but were unrelated to tumor stage (P = 0.7066) or nodal stage (P = 0.5807). No difference in patient prognosis was found between 248 CCND1/PPFIA1 coamplified tumors and 117 tumors with CCND1 amplification alone (P = 0.6419). These data show that PPFIA1 amplification occurs frequently in breast cancer. The higher incidence of CCND1 amplification when compared with PPFIA1, the lack of prognostic relevance of coamplifications, and the fact that PPFIA1 amplification was found exclusively in CCND1-amplified cancers suggest that PPFIA1 gene copy number changes represent concurrent events of CCND1 amplification rather than specific biological incidents.


Subject(s)
Signal Transducing Adaptor Proteins/genetics, Breast Neoplasms/genetics, Cyclin D1/genetics, Gene Amplification, Adult, Aged, Aged 80 and over, Breast Neoplasms/pathology, Human Chromosome Pair 11, Female, Gene Dosage, Humans, Incidence, Middle Aged, Phenotype, Prognosis, Tissue Array Analysis
14.
Algorithms Mol Biol ; 16(1): 20, 2021 Aug 23.
Article in English | MEDLINE | ID: mdl-34425870

ABSTRACT

BACKGROUND: Repetitive elements contribute a large part of eukaryotic genomes. For example, about 40 to 50% of human, mouse and rat genomes are repetitive. So identifying and classifying repeats is an important step in genome annotation. This annotation step is traditionally performed using alignment based methods, either in a de novo approach or by aligning the genome sequence to a species specific set of repetitive sequences. Recently, Li (Bioinformatics 35:4408-4410, 2019) developed a novel software tool dna-brnn to annotate repetitive sequences using a recurrent neural network trained on sample annotations of repetitive elements. RESULTS: We have developed the methods of dna-brnn further and engineered a new software tool DeepGRP. This combines the basic concepts of Li (Bioinformatics 35:4408-4410, 2019) with current techniques developed for neural machine translation, the attention mechanism, for the task of nucleotide-level annotation of repetitive elements. An evaluation on the human genome shows a 20% improvement of the Matthews correlation coefficient for the predictions delivered by DeepGRP, when compared to dna-brnn. DeepGRP predicts two additional classes of repeats (compared to dna-brnn) and is able to transfer repeat annotations, using RepeatMasker-based training data to a different species (mouse). Additionally, we could show that DeepGRP predicts repeats annotated in the Dfam database, but not annotated by RepeatMasker. DeepGRP is highly scalable due to its implementation in the TensorFlow framework. For example, the GPU-accelerated version of DeepGRP is approx. 1.8 times faster than dna-brnn, approx. 8.6 times faster than RepeatMasker and over 100 times faster than HMMER searching for models of the Dfam database. CONCLUSIONS: By incorporating methods from neural machine translation, DeepGRP achieves a consistent improvement of the quality of the predictions compared to dna-brnn. Improved running times are obtained by employing TensorFlow as implementation framework and the use of GPUs. By incorporating two additional classes of repeats, DeepGRP provides more complete annotations, which were evaluated against three state-of-the-art tools for repeat annotation.
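
For readers unfamiliar with the evaluation measure, the Matthews correlation coefficient for a binary per-nucleotide labelling (repeat versus non-repeat) is computed from the four confusion-matrix counts. The sketch below shows the standard formula; the helper names and the label strings are made up for illustration and are not taken from the DeepGRP evaluation.

```python
import math


def confusion_counts(truth, predicted):
    """Count TP, TN, FP, FN for two equal-length sequences of 0/1 labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(truth, predicted))
    tn = sum(t == 0 and p == 0 for t, p in zip(truth, predicted))
    fp = sum(t == 0 and p == 1 for t, p in zip(truth, predicted))
    fn = sum(t == 1 and p == 0 for t, p in zip(truth, predicted))
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is set to 0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Made-up per-base labels: 1 = annotated as repeat, 0 = not a repeat.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
counts = confusion_counts(truth, predicted)
print(counts, matthews_corrcoef(*counts))
```

The coefficient ranges from -1 to 1, which makes it less sensitive than accuracy to the imbalance between repeat and non-repeat positions.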

15.
BMC Genomics ; 11: 335, 2010 May 27.
Article in English | MEDLINE | ID: mdl-20507619

ABSTRACT

BACKGROUND: The Mongolian gerbils are a good model to mimic the Helicobacter pylori-associated pathogenesis of the human stomach. In the current study the gerbil-adapted strain B8 was completely sequenced, annotated and compared to previous genomes, including the 73 supercontigs of the parental strain B128. RESULTS: The complete genome of H. pylori B8 was manually curated gene by gene, to assign as much function as possible. It consists of a circular chromosome of 1,673,997 bp and of a small plasmid of 6,032 bp carrying nine putative genes. The chromosome contains 1,711 coding sequences, 293 of which are strain-specific, coding mainly for hypothetical proteins, and a large plasticity zone containing a putative type-IV-secretion system and coding sequences with unknown function. The cag-pathogenicity island is rearranged such that the cagA-gene is located 13,730 bp downstream of the inverted gene cluster cagB-cag1. Directly adjacent to the cagA-gene, there are four hypothetical genes and one variable gene with a different codon usage compared to the rest of the H. pylori B8-genome. This indicates that these coding sequences might be acquired via horizontal gene transfer. The genome comparison of strain B8 to its parental strain B128 delivers 425 unique B8-proteins. Due to the fact that strain B128 was not fully sequenced and only automatically annotated, only 12 of these proteins are definitive singletons that might have been acquired during the gerbil-adaptation process of strain B128. CONCLUSION: Our sequence data and its analysis provide new insight into the high genetic diversity of H. pylori-strains. We have shown that the gerbil-adapted strain B8 has the potential to build, possibly by a high rate of mutation and recombination, a dynamic pool of genetic variants (e.g. fragmented genes and repetitive regions) required for the adaptation-processes. We hypothesize that these variants are essential for the colonization and persistence of strain B8 in the gerbil stomach during inflammation.


Subject(s)
Adaptation, Physiological , Genomics/methods , Gerbillinae/microbiology , Helicobacter pylori/genetics , Helicobacter pylori/physiology , Sequence Analysis, DNA/methods , Animals , Antigens, Bacterial/genetics , Bacterial Proteins/genetics , Codon/genetics , Genetic Variation , Genome, Bacterial/genetics , Humans , Plasmids/genetics , Proteome/genetics , Species Specificity , Stomach/microbiology
16.
Bioinformatics ; 25(24): 3251-8, 2009 Dec 15.
Article in English | MEDLINE | ID: mdl-19828575

ABSTRACT

MOTIVATION: Profile hidden Markov models (pHMMs) are currently the most popular modeling concept for protein families. They provide sensitive family descriptors, and sequence database searching with pHMMs has become a standard task in today's genome annotation pipelines. On the downside, searching with pHMMs is computationally expensive. RESULTS: We propose a new method for efficient protein family classification and for speeding up database searches with pHMMs as is necessary for large-scale analysis scenarios. We employ simpler models of protein families called position-specific scoring matrices family models (PSSM-FMs). For fast database search, we combine full-text indexing, efficient exact p-value computation of PSSM match scores and fast fragment chaining. The resulting method is well suited to prefilter the set of sequences to be searched for subsequent database searches with pHMMs. We achieved a classification performance only marginally inferior to hmmsearch, yet, results could be obtained in a fraction of runtime with a speedup of >64-fold. In experiments addressing the method's ability to prefilter the sequence space for subsequent database searches with pHMMs, our method reduces the number of sequences to be searched with hmmsearch to only 0.80% of all sequences. The filter is very fast and leads to a total speedup of factor 43 over the unfiltered search, while retaining >99.5% of the original results. In a lossless filter setup for hmmsearch on UniProtKB/Swiss-Prot, we observed a speedup of factor 92. AVAILABILITY: The presented algorithms are implemented in the program PoSSuMsearch2, available for download at http://bibiserv.techfak.uni-bielefeld.de/possumsearch2/. CONTACT: beckstette@zbh.uni-hamburg.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
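
As a rough illustration of the prefiltering idea described in this abstract, the following sketch scores every window of a sequence against a position-specific scoring matrix and keeps only sequences with at least one window at or above a threshold. It is a naive linear scan under stated assumptions, not the index-based search, p-value computation, or fragment chaining of PoSSuMsearch2; the toy matrix `PSSM`, the reduced alphabet, and the function names `scan` and `prefilter` are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy PSSM over a reduced alphabet: one score table per column.
PSSM: List[Dict[str, int]] = [
    {"A": 2, "C": -1, "G": -2},
    {"A": -1, "C": 3, "G": -1},
    {"A": -2, "C": -1, "G": 2},
]


def scan(sequence: str, pssm: List[Dict[str, int]], threshold: int) -> List[Tuple[int, int]]:
    """Return (start, score) for every window whose summed score reaches the threshold."""
    m = len(pssm)
    hits = []
    for start in range(len(sequence) - m + 1):
        score = 0
        for offset, column in enumerate(pssm):
            residue = sequence[start + offset]
            # Assumption: unknown residues get the lowest score of the column.
            score += column.get(residue, min(column.values()))
        if score >= threshold:
            hits.append((start, score))
    return hits


def prefilter(database: Dict[str, str], pssm: List[Dict[str, int]], threshold: int) -> List[str]:
    """Keep only the names of sequences with at least one window above the threshold."""
    return [name for name, seq in database.items() if scan(seq, pssm, threshold)]
```

In a pipeline of the kind the abstract describes, only the sequences returned by `prefilter` would be passed on to the more expensive pHMM search.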


Subject(s)
Computational Biology/methods , Databases, Protein , Markov Chains , Position-Specific Scoring Matrices , Proteins/classification , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Algorithms , Pattern Recognition, Automated/methods , Software
17.
Bioinformatics ; 25(4): 533-4, 2009 Feb 15.
Article in English | MEDLINE | ID: mdl-19106120

ABSTRACT

SUMMARY: To analyse the vast amount of genome annotation data available today, a visual representation of genomic features in a given sequence range is required. We developed a C library which provides layout and drawing capabilities for annotation features. It supports several common input and output formats and can easily be integrated into custom C applications. To exemplify the use of AnnotationSketch in other languages, we provide bindings to the scripting languages Ruby, Python and Lua. AVAILABILITY: The software is available under an open-source license as part of GenomeTools (http://genometools.org/annotationsketch.html).


Subject(s)
Genome , Software , Computer Graphics , Databases, Factual , Gene Expression Profiling/methods , Programming Languages , User-Computer Interface
18.
BMC Cancer ; 10: 78, 2010 Mar 03.
Article in English | MEDLINE | ID: mdl-20199686

ABSTRACT

BACKGROUND: Increased transcription of oncogenes like the epidermal growth factor receptor (EGFR) is frequently caused by amplification of the whole gene or at least of regulatory sequences. The aim of this study was to pinpoint mechanistic parameters occurring during egfr copy number gains leading to a stable EGFR overexpression and high sensitivity to extracellular signalling. A deeper understanding of those marker events might improve early diagnosis of cancer in suspect lesions, early detection of cancer progression and the prediction of egfr-targeted therapies. METHODS: The basal-like/stemness type breast cancer cell line subpopulation MDA-MB-468 CD44high/CD24-/low, carrying high egfr amplifications, was chosen as a model system in this study. Subclones of the heterogeneous cell line expressing low and high EGF receptor densities were isolated by cell sorting. Genomic profiling was carried out for these by means of SNP array profiling, qPCR and FISH. Cell cycle analysis was performed using the BrdU quenching technique. RESULTS: Low and high EGFR expressing MDA-MB-468 CD44+/CD24-/low subpopulations separated by cell sorting showed intermediate and high copy numbers of egfr, respectively. However, during cell culture an increase solely for egfr gene copy numbers in the intermediate subpopulation occurred. This shift was based on the formation of new cells which regained egfr gene copies. By two-parametric cell cycle analysis, clonal effects mediated through growth advantage of cells bearing higher egfr gene copy numbers could most likely be excluded as the driving force. Subsequently, the detection of a fragile site distal to the egfr gene, sustaining uncapped telomere-less chromosomal ends, the ladder-like structure of the intrachromosomal egfr amplification and a broader range of egfr copy numbers support the assumption that dynamic chromosomal rearrangements, like breakage-fusion-bridge cycles, rather than proliferation drive the gain of egfr copies. CONCLUSION: Progressive genome modulation in the CD44+/CD24-/low subpopulation of the breast cancer cell line MDA-MB-468 leads to different coexisting subclones. In isolated low-copy cells asymmetric chromosomal segregation leads to new cells that have regained solely egfr gene copies. Furthermore, egfr regain resulted in enhanced signal transduction of the MAP-kinase and PI3-kinase pathway. We show here for the first time a dynamic copy number regain in basal-like/stemness cell type breast cancer subpopulations which might explain genetic heterogeneity. Moreover, this process might also be involved in adaptive growth factor receptor intracellular signaling which supports survival and migration during cancer development and progression.


Subject(s)
Breast Neoplasms/metabolism , CD24 Antigen/biosynthesis , ErbB Receptors/genetics , Hyaluronan Receptors/biosynthesis , Cell Cycle , Cell Line, Tumor , Female , Flow Cytometry/methods , Gene Dosage , Gene Expression Profiling , Genetic Variation , Humans , Kinetics , Polymorphism, Single Nucleotide , Signal Transduction
19.
PLoS Comput Biol ; 5(9): e1000502, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19750212

ABSTRACT

With few exceptions, current methods for short read mapping make use of simple seed heuristics to speed up the search. Most of the underlying matching models neglect the necessity to allow not only mismatches, but also insertions and deletions. Current evaluations indicate, however, that very different error models apply to the novel high-throughput sequencing methods. While the most frequent error-type in Illumina reads are mismatches, reads produced by 454's GS FLX predominantly contain insertions and deletions (indels). Even though 454 sequencers are able to produce longer reads, the method is frequently applied to small RNA (miRNA and siRNA) sequencing. Fast and accurate matching in particular of short reads with diverse errors is therefore a pressing practical problem. We introduce a matching model for short reads that can, besides mismatches, also cope with indels. It addresses different error models. For example, it can handle the problem of leading and trailing contaminations caused by primers and poly-A tails in transcriptomics or the length-dependent increase of error rates. In these contexts, it thus simplifies the tedious and error-prone trimming step. For efficient searches, our method utilizes index structures in the form of enhanced suffix arrays. In a comparison with current methods for short read mapping, the presented approach shows significantly increased performance not only for 454 reads, but also for Illumina reads. Our approach is implemented in the software segemehl available at http://www.bioinf.uni-leipzig.de/Software/segemehl/.
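
To make the role of a suffix-array index in read mapping more concrete, the sketch below builds a plain suffix array for a reference string and locates all exact occurrences of a short seed by binary search. It is a simplified illustration only: it handles exact matches, whereas the matching model in the abstract also covers mismatches and indels, and it uses a naively sorted suffix array rather than an enhanced suffix array. The function names `build_suffix_array` and `find_occurrences` are hypothetical and do not come from segemehl.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Naive construction: sort suffix start positions by the suffix they begin.

    Fine for short illustrative strings; real tools use far more efficient builders.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: List[int], pattern: str) -> List[int]:
    """Return all start positions of exact matches of pattern, in ascending order."""
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])
```

A usage example: with `reference = "ACGTACGA"` and `sa = build_suffix_array(reference)`, calling `find_occurrences(reference, sa, "ACG")` yields the two seed positions 0 and 4.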


Subject(s)
Computational Biology/methods , DNA Mutational Analysis/methods , Mutation , Algorithms , Base Sequence , Sequence Alignment
20.
Front Microbiol ; 10: 2296, 2019.
Article in English | MEDLINE | ID: mdl-31649639

ABSTRACT

The microbial community composition and its functionality were assessed for hydrothermal fluids and volcanic ash sediments from Haungaroa and hydrothermal fluids from the Brothers volcano in the Kermadec island arc (New Zealand). The Haungaroa volcanic ash sediments were dominated by epsilonproteobacterial Sulfurovum sp. Ratios of electron donor consumption to CO2 fixation from respective sediment incubations indicated that sulfide oxidation appeared to fuel autotrophic CO2 fixation, coinciding with thermodynamic estimates predicting sulfide oxidation as the major energy source in the environment. Transcript analyses with the sulfide-supplemented sediment slurries demonstrated that Sulfurovum prevailed in the experiments as well. Hence, our sediment incubations appeared to simulate environmental conditions well, suggesting that sulfide oxidation catalyzed by Sulfurovum members drives biomass synthesis in the volcanic ash sediments. For the Haungaroa fluids, no inorganic electron donor and no responsible microorganisms could be identified that clearly stimulated autotrophic CO2 fixation. In the Brothers hydrothermal fluids Sulfurimonas (49%) and Hydrogenovibrio/Thiomicrospira (15%) species prevailed. Respective fluid incubations exhibited highest autotrophic CO2 fixation if supplemented with iron(II) or hydrogen. Likewise, catabolic energy calculations predicted primarily iron(II) but also hydrogen oxidation as major energy sources in the natural fluids. According to transcript analyses with material from the incubation experiments, Thiomicrospira/Hydrogenovibrio species dominated, outcompeting Sulfurimonas. Given that experimental conditions likely only simulated environmental conditions that cause Thiomicrospira/Hydrogenovibrio but not Sulfurimonas to thrive, it remains unclear which environmental parameters determine Sulfurimonas' dominance in the Brothers natural hydrothermal fluids.
