Results 1 - 20 of 49
1.
Int J Cancer ; 2024 May 06.
Article in English | MEDLINE | ID: mdl-38709956

ABSTRACT

We analyzed variations in the epidermal growth factor receptor (EGFR) gene and 5'-upstream region to identify potential molecular predictors of treatment response in primary epithelial ovarian cancer. Tumor tissues collected during debulking surgery from the prospective multicenter OVCAD study were investigated. Copy number variations in the human endogenous retrovirus sequence human endogenous retrovirus K9 (HERVK9) and EGFR Exons 7 and 9, as well as repeat length and loss of heterozygosity of polymorphic CA-SSR I and relative EGFR mRNA expression were determined quantitatively. At least one EGFR variation was observed in 94% of the patients. Among the 30 combinations of variations discovered, enhanced platinum sensitivity (n = 151) was found dominantly with HERVK9 haploidy and Exon 7 tetraploidy, overrepresented among patients with survival ≥120 months (24/29, p = .0212). EGFR overexpression (≥80 percentile) was significantly less likely in the responders (17% vs. 32%, p = .044). Multivariate Cox regression analysis, including age, FIGO stage, and grade, indicated that the patients' subgroup was prognostically significant for CA-SSR I repeat length <18 CA for both alleles (HR 0.276, 95% confidence interval 0.109-0.655, p = .001). Although EGFR variations occur in ovarian cancer, the mRNA levels remain low compared to other EGFR-mutated cancers. Notably, the inherited length of the CA-SSR I repeat, HERVK9 haploidy, and Exon 7 tetraploidy conferred three times higher odds ratio to survive for more than 10 years under therapy. This may add value in guiding therapies if determined during follow-up in circulating tumor cells or circulating tumor DNA and offers HERVK9 as a potential therapeutic target.

2.
Algorithms Mol Biol ; 16(1): 20, 2021 Aug 23.
Article in English | MEDLINE | ID: mdl-34425870

ABSTRACT

BACKGROUND: Repetitive elements contribute a large part of eukaryotic genomes. For example, about 40 to 50% of human, mouse and rat genomes are repetitive. So identifying and classifying repeats is an important step in genome annotation. This annotation step is traditionally performed using alignment based methods, either in a de novo approach or by aligning the genome sequence to a species specific set of repetitive sequences. Recently, Li (Bioinformatics 35:4408-4410, 2019) developed a novel software tool dna-brnn to annotate repetitive sequences using a recurrent neural network trained on sample annotations of repetitive elements. RESULTS: We have developed the methods of dna-brnn further and engineered a new software tool DeepGRP. This combines the basic concepts of Li (Bioinformatics 35:4408-4410, 2019) with current techniques developed for neural machine translation, the attention mechanism, for the task of nucleotide-level annotation of repetitive elements. An evaluation on the human genome shows a 20% improvement of the Matthews correlation coefficient for the predictions delivered by DeepGRP, when compared to dna-brnn. DeepGRP predicts two additional classes of repeats (compared to dna-brnn) and is able to transfer repeat annotations, using RepeatMasker-based training data to a different species (mouse). Additionally, we could show that DeepGRP predicts repeats annotated in the Dfam database, but not annotated by RepeatMasker. DeepGRP is highly scalable due to its implementation in the TensorFlow framework. For example, the GPU-accelerated version of DeepGRP is approx. 1.8 times faster than dna-brnn, approx. 8.6 times faster than RepeatMasker and over 100 times faster than HMMER searching for models of the Dfam database. CONCLUSIONS: By incorporating methods from neural machine translation, DeepGRP achieves a consistent improvement of the quality of the predictions compared to dna-brnn. 
Improved running times are obtained by employing TensorFlow as implementation framework and the use of GPUs. By incorporating two additional classes of repeats, DeepGRP provides more complete annotations, which were evaluated against three state-of-the-art tools for repeat annotation.
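
The Matthews correlation coefficient used in the DeepGRP evaluation can be illustrated for the simplest case of a binary, per-nucleotide labeling. The sketch below is a hypothetical illustration and not DeepGRP code: the function names, the `R`/`.` label alphabet and the example strings are assumptions, and DeepGRP itself predicts several repeat classes rather than two.

```python
import math


def confusion_counts(truth: str, pred: str, positive: str = "R"):
    """Count TP, TN, FP, FN over two equally long per-nucleotide label strings."""
    if len(truth) != len(pred):
        raise ValueError("label strings must have the same length")
    tp = tn = fp = fn = 0
    for t, p in zip(truth, pred):
        if t == positive and p == positive:
            tp += 1          # repeat correctly predicted as repeat
        elif t != positive and p != positive:
            tn += 1          # non-repeat correctly predicted as non-repeat
        elif p == positive:
            fp += 1          # non-repeat wrongly predicted as repeat
        else:
            fn += 1          # repeat missed by the prediction
    return tp, tn, fp, fn


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Binary MCC; returns 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical labels: "R" marks a repeat position, "." a non-repeat position.
truth = "RRRR...."
pred = "RRR.R..."
counts = confusion_counts(truth, pred)
mcc = matthews_corrcoef(*counts)
```

The coefficient ranges from -1 (every position mislabeled) to 1 (perfect agreement), which makes it less sensitive to class imbalance than plain accuracy when most of a sequence belongs to one class.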

3.
Front Microbiol ; 10: 2296, 2019.
Article in English | MEDLINE | ID: mdl-31649639

ABSTRACT

The microbial community composition and its functionality were assessed for hydrothermal fluids and volcanic ash sediments from Haungaroa and hydrothermal fluids from the Brothers volcano in the Kermadec island arc (New Zealand). The Haungaroa volcanic ash sediments were dominated by epsilonproteobacterial Sulfurovum sp. Ratios of electron donor consumption to CO2 fixation from respective sediment incubations indicated that sulfide oxidation appeared to fuel autotrophic CO2 fixation, coinciding with thermodynamic estimates predicting sulfide oxidation as the major energy source in the environment. Transcript analyses with the sulfide-supplemented sediment slurries demonstrated that Sulfurovum prevailed in the experiments as well. Hence, our sediment incubations appeared to simulate environmental conditions well, suggesting that sulfide oxidation catalyzed by Sulfurovum members drives biomass synthesis in the volcanic ash sediments. For the Haungaroa fluids, no inorganic electron donor and responsible microorganisms could be identified that clearly stimulated autotrophic CO2 fixation. In the Brothers hydrothermal fluids, Sulfurimonas (49%) and Hydrogenovibrio/Thiomicrospira (15%) species prevailed. Respective fluid incubations exhibited highest autotrophic CO2 fixation if supplemented with iron(II) or hydrogen. Likewise, catabolic energy calculations predicted primarily iron(II) but also hydrogen oxidation as major energy sources in the natural fluids. According to transcript analyses with material from the incubation experiments, Thiomicrospira/Hydrogenovibrio species dominated, outcompeting Sulfurimonas. Given that experimental conditions likely only simulated environmental conditions that cause Thiomicrospira/Hydrogenovibrio but not Sulfurimonas to thrive, it remains unclear which environmental parameters determine Sulfurimonas' dominance in the Brothers natural hydrothermal fluids.

4.
Bioinformatics ; 35(16): 2853-2855, 2019 08 15.
Article in English | MEDLINE | ID: mdl-30596893

ABSTRACT

SUMMARY: The graphical fragment assembly (GFA) formats are emerging standard formats for the representation of sequence graphs. Although GFA 1 was primarily targeting assembly graphs, the newer GFA 2 format introduces several features, which makes it suitable for representing other kinds of information, such as scaffolding graphs, variation graphs, alignment graphs and colored metagenomic graphs. Here, we present GfaViz, an interactive graphical tool for the visualization of sequence graphs in GFA format. The software supports all new features of GFA 2 and introduces conventions for their visualization. The user can choose between two different layouts and multiple styles for representing single elements or groups. All customizations can be stored in custom tags of the GFA format itself, without requiring external configuration files. Stylesheets are supported for storing standard configuration options for groups of files. The visualizations can be exported to raster and vector graphics formats. A command line interface allows for batch generation of images. AVAILABILITY AND IMPLEMENTATION: GfaViz is available at https://github.com/ggonnella/gfaviz. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Software , Metagenome , Sequence Analysis
5.
Nucleic Acids Res ; 47(1): 341-361, 2019 01 10.
Article in English | MEDLINE | ID: mdl-30357366

ABSTRACT

The RNA-binding protein TDP-43 is heavily implicated in neurodegenerative disease. Numerous patient mutations in TARDBP, the gene encoding TDP-43, combined with data from animal and cell-based models, imply that altered RNA regulation by TDP-43 causes Amyotrophic Lateral Sclerosis and Frontotemporal Dementia. However, underlying mechanisms remain unresolved. Increased cytoplasmic TDP-43 levels in diseased neurons suggest a possible role in this cellular compartment. Here, we examined the impact on translation of overexpressing human TDP-43 and the TDP-43A315T patient mutant protein in motor neuron-like cells and primary cultures of cortical neurons. In motor-neuron like cells, TDP-43 associates with ribosomes without significantly affecting global translation. However, ribosome profiling and additional assays revealed enhanced translation and direct binding of Camta1, Mig12, and Dennd4a mRNAs. Overexpressing either wild-type TDP-43 or TDP-43A315T stimulated translation of Camta1 and Mig12 mRNAs via their 5'UTRs and increased CAMTA1 and MIG12 protein levels. In contrast, translational enhancement of Dennd4a mRNA required a specific 3'UTR region and was specifically observed with the TDP-43A315T patient mutant allele. Our data reveal that TDP-43 can function as an mRNA-specific translational enhancer. Moreover, since CAMTA1 and DENND4A are linked to neurodegeneration, they suggest that this function could contribute to disease.


Subjects
Calcium-Binding Proteins/genetics , DNA-Binding Proteins/genetics , Neurodegenerative Diseases/genetics , Trans-Activators/genetics , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/pathology , Animals , Cytoplasm/genetics , Cytoplasm/metabolism , Frontotemporal Dementia/genetics , Frontotemporal Dementia/pathology , Gene Expression Regulation/genetics , Humans , Mice , Microtubule-Associated Proteins/genetics , Motor Neurons/metabolism , Motor Neurons/pathology , Mutation , Neurodegenerative Diseases/pathology , Primary Cell Culture , RNA, Messenger/genetics , Ribosomes/genetics
6.
Sci Rep ; 8(1): 10386, 2018 07 10.
Article in English | MEDLINE | ID: mdl-29991752

ABSTRACT

To assess the risk that mining of seafloor massive sulfides (SMS) from extinct hydrothermal vent environments has for changing the ecosystem irreversibly, we sampled SMS analogous habitats from the Kairei and the Pelagia vent fields along the Indian Ridge. In total 19.8 million 16S rRNA tags from 14 different sites were analyzed and the microbial communities were compared with each other and with publicly available data sets from other marine environments. The chimneys appear to provide habitats for microorganisms that are not found or only detectable in very low numbers in other marine habitats. The chimneys also host rare organisms and may function as a vital part of the ocean's seed bank. Many of the reads from active and inactive chimney samples were clustered into OTUs, with low or no resemblance to known species. Since we are unaware of the chemical reactions catalyzed by these unknown organisms, the impact of this diversity loss and bio-geo-coupling is hard to predict. Given that chimney structures can be considered SMS analogues, removal of sulfide deposits from the seafloor in the Kairei and Pelagia fields will most likely alter microbial compositions and affect element cycling in the benthic regions and probably beyond.


Subjects
Ecosystem , Hydrothermal Vents/microbiology , Microbiota , Sulfides/isolation & purification , Biodiversity , Indian Ocean , Microbiota/genetics , Mining/methods , Oceans and Seas
7.
Bioinformatics ; 33(19): 3094-3095, 2017 Oct 01.
Article in English | MEDLINE | ID: mdl-28645150

ABSTRACT

SUMMARY: GFA 1 and GFA 2 are recently defined formats for representing sequence graphs, such as assembly, variation or splicing graphs. The formats are adopted by several software tools. Here, we present GfaPy, a software package for creating, parsing and editing GFA graphs using the programming language Python. GfaPy supports GFA 1 and GFA 2, using the same interface and allows for interconversion between both formats. The software package provides a simple interface for custom record types, which is an important new feature of GFA 2 (compared to GFA 1). This enables new applications of the format. AVAILABILITY AND IMPLEMENTATION: GfaPy is available open source at https://github.com/ggonnella/gfapy and installable via pip. CONTACT: gonnella@zbh.uni-hamburg.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Sequence Analysis/methods , Software , Computer Graphics , Programming Languages
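
To make the kind of data handled by GFA tools concrete, the following sketch reads segment (`S`) and link (`L`) records from a small GFA 1 text. It is a minimal illustration in plain Python and deliberately does not use or imitate the GfaPy API; the `Graph` class, the `parse_gfa1` function and the example graph are all hypothetical, and optional tags, other record types and GFA 2 are ignored.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Graph:
    # segment name -> nucleotide sequence
    segments: Dict[str, str] = field(default_factory=dict)
    # (from, from_orient, to, to_orient, overlap CIGAR)
    links: List[Tuple[str, ...]] = field(default_factory=list)


def parse_gfa1(text: str) -> Graph:
    """Collect S and L records from tab-separated GFA 1 text."""
    graph = Graph()
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if fields[0] == "S":
            graph.segments[fields[1]] = fields[2]
        elif fields[0] == "L":
            graph.links.append(tuple(fields[1:6]))
        # header (H) and all other record types are ignored in this sketch
    return graph


# Hypothetical two-segment graph with one link and a 2-base overlap.
EXAMPLE = "H\tVN:Z:1.0\nS\t1\tACGT\nS\t2\tGTTA\nL\t1\t+\t2\t+\t2M\n"
graph = parse_gfa1(EXAMPLE)
```

A dedicated library additionally validates fields, resolves references between records and handles custom record types, which is why a package such as the one described above is preferable to ad hoc parsing for real data.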
8.
PeerJ ; 4: e2681, 2016.
Article in English | MEDLINE | ID: mdl-27843717

ABSTRACT

The "Graphical Fragment Assembly" (GFA) is an emerging format for the representation of sequence assembly graphs, which can be adopted by both de Bruijn graph- and string graph-based assemblers. Here we present RGFA, an implementation of the proposed GFA specification in Ruby. It allows the user to conveniently parse, edit and write GFA files. Complex operations such as the separation of the implicit instances of repeats and the merging of linear paths can be performed. A typical application of RGFA is the editing of a graph, to finish the assembly of a sequence, using information not available to the assembler. We illustrate a use case, in which the assembly of a repetitive metagenomic fosmid insert was completed using a script based on RGFA. Furthermore, we show how the API provided by RGFA can be employed to design complex graph editing algorithms. As an example, we developed a detection algorithm for CRISPRs in a de Bruijn graph. Finally, RGFA can be used for comparing assembly graphs, e.g., to document the changes in a graph after applying a GUI editor. A program, GFAdiff, is provided, which compares the information in two graphs and generates a report or a Ruby script documenting the transformation steps between the graphs.

9.
Nat Microbiol ; 1(8): 16086, 2016 06 13.
Article in English | MEDLINE | ID: mdl-27573109

ABSTRACT

Hydrothermal vent systems host microbial communities among which several microorganisms have been considered endemic to this type of habitat. It is still unclear how these organisms colonize geographically distant hydrothermal environments. Based on 16S rRNA gene sequences, we compare the bacterial communities of sixteen Atlantic hydrothermal vent samples with our own and publicly available global open ocean samples. Analysing sequences obtained from 63 million 16S rRNA genes, the genera we could identify in the open ocean waters contained 99.9% of the vent reads. This suggests that previously observed vent exclusiveness is, in most cases, probably an artefact of lower sequencing depth. These findings are a further step towards elucidating the role of the open ocean as a seed bank. They can explain the predicament of how species expected to be endemic to vent systems are able to colonize geographically distant hydrothermal habitats and contribute to our understanding of whether 'everything is really everywhere'.


Subjects
Bacteria/classification , Bacteria/isolation & purification , Biodiversity , Hydrothermal Vents/microbiology , Phylogeography , Atlantic Ocean , Bacteria/genetics , Cluster Analysis , DNA, Ribosomal/chemistry , DNA, Ribosomal/genetics , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA
10.
Appl Environ Microbiol ; 80(15): 4585-98, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24837379

ABSTRACT

The active venting Sisters Peak (SP) chimney on the Mid-Atlantic Ridge holds the current temperature record for the hottest ever measured hydrothermal fluids (400°C, accompanied by sudden temperature bursts reaching 464°C). Given the unprecedented temperature regime, we investigated the biome of this chimney with a focus on special microbial adaptations for thermal tolerance. The SP metagenome reveals considerable differences in the taxonomic composition from those of other hydrothermal vent and subsurface samples; these could be better explained by temperature than by other available abiotic parameters. The most common species to which SP genes were assigned were thermophilic Aciduliprofundum sp. strain MAR08-339 (11.8%), Hippea maritima (3.8%), Caldisericum exile (1.5%), and Caminibacter mediatlanticus (1.4%) as well as to the mesophilic Niastella koreensis (2.8%). A statistical analysis of associations between taxonomic and functional gene assignments revealed specific overrepresented functional categories: for Aciduliprofundum, protein biosynthesis, nucleotide metabolism, and energy metabolism genes; for Hippea and Caminibacter, cell motility and/or DNA replication and repair system genes; and for Niastella, cell wall and membrane biogenesis genes. Cultured representatives of these organisms inhabit different thermal niches; i.e., Aciduliprofundum has an optimal growth temperature of 70°C, Hippea and Caminibacter have optimal growth temperatures around 55°C, and Niastella grows between 10 and 37°C. Therefore, we posit that the different enrichment profiles of functional categories reflect distinct microbial strategies to deal with the different impacts of the local sudden temperature bursts in disparate regions of the chimney.


Subjects
Bacteria/isolation & purification , Seawater/microbiology , Bacteria/classification , Bacteria/genetics , Bacteria/growth & development , Hot Temperature , Molecular Sequence Data , Phylogeny , Seawater/chemistry
11.
J Clin Bioinforma ; 4(1): 5, 2014 Mar 31.
Article in English | MEDLINE | ID: mdl-24684958

ABSTRACT

BACKGROUND: A comprehensive view of all relevant genomic data is instrumental for understanding the complex patterns of molecular alterations typically found in cancer cells. One of the most effective ways to rapidly obtain an overview of genomic alterations in large amounts of genomic data is the integrative visualization of genomic events. RESULTS: We developed FISH Oracle 2, a web server for the interactive visualization of different kinds of downstream processed genomics data typically available in cancer research. A powerful search interface and a fast visualization engine provide a highly interactive visualization for such data. High quality image export enables life scientists to easily communicate their results. A comprehensive data administration makes it possible to keep track of the available data sets. We applied FISH Oracle 2 to published data and found evidence that, in colorectal cancer cells, the gene TTC28 may be inactivated in two different ways, a fact that has not been published before. CONCLUSIONS: The interactive nature of FISH Oracle 2 and the possibility to store, select and visualize large amounts of downstream processed data support life scientists in generating hypotheses. The export of high quality images supports explanatory data visualization, simplifying the communication of new biological findings. A FISH Oracle 2 demo server and the software are available at http://www.zbh.uni-hamburg.de/fishoracle.

12.
Article in English | MEDLINE | ID: mdl-24091398

ABSTRACT

Genome annotations are often published as plain text files describing genomic features and their subcomponents by an implicit annotation graph. In this paper, we present the GenomeTools, a convenient and efficient software library and associated software tools for developing bioinformatics software intended to create, process or convert annotation graphs. The GenomeTools strictly follow the annotation graph approach, offering a unified graph-based representation. This gives the developer intuitive and immediate access to genomic features and tools for their manipulation. To process large annotation sets with low memory overhead, we have designed and implemented an efficient pull-based approach for sequential processing of annotations. This makes it possible to handle even the largest annotation sets, such as a complete catalogue of human variations. Our object-oriented C-based software library enables a developer to conveniently implement their own functionality on annotation graphs and to integrate it into larger workflows, simultaneously accessing compressed sequence data if required. The careful C implementation of the GenomeTools not only ensures a light-weight memory footprint while allowing full sequential as well as random access to the annotation graph, but also facilitates the creation of bindings to a variety of scripting languages (like Python and Ruby) sharing the same interface.


Subjects
Genomics/methods , Molecular Sequence Annotation/methods , Software , Genome, Human , Humans
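
The pull-based processing idea mentioned in the abstract above can be sketched with Python generators: each stage requests the next annotation from the stage before it, so only one feature needs to be in memory at a time. This is an illustrative analogy only and not the GenomeTools API; the `Feature` type, the stage functions and the example GFF3 lines are hypothetical, and the parent/child structure of a real annotation graph is flattened away.

```python
from typing import Iterable, Iterator, NamedTuple


class Feature(NamedTuple):
    seqid: str
    type: str
    start: int
    end: int


def read_features(lines: Iterable[str]) -> Iterator[Feature]:
    """Source stage: lazily turn GFF3 feature lines into Feature records."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        yield Feature(cols[0], cols[2], int(cols[3]), int(cols[4]))


def select_type(stream: Iterable[Feature], wanted: str) -> Iterator[Feature]:
    """Filter stage: pass on only features of the requested type."""
    for feature in stream:
        if feature.type == wanted:
            yield feature


def min_length(stream: Iterable[Feature], n: int) -> Iterator[Feature]:
    """Filter stage: pass on only features spanning at least n positions."""
    for feature in stream:
        if feature.end - feature.start + 1 >= n:
            yield feature


# Hypothetical input; in practice this would be an open file handle.
GFF3_LINES = [
    "##gff-version 3",
    "chr1\tdemo\tgene\t1\t1000\t.\t+\t.\tID=g1",
    "chr1\tdemo\texon\t1\t200\t.\t+\t.\tParent=g1",
    "chr1\tdemo\texon\t500\t550\t.\t+\t.\tParent=g1",
]

# Nothing is read until the final consumer pulls from the last stage.
pipeline = min_length(select_type(read_features(GFF3_LINES), "exon"), 100)
long_exons = list(pipeline)
```

Because the stages are chained lazily, memory use stays independent of the number of input lines as long as the consumer does not collect everything into a list, as done here only for demonstration.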
13.
BMC Bioinformatics ; 14: 226, 2013 Jul 17.
Article in English | MEDLINE | ID: mdl-23865810

ABSTRACT

BACKGROUND: It is well known that the search for homologous RNAs is more effective if both sequence and structure information is incorporated into the search. However, current tools for searching with RNA sequence-structure patterns cannot fully handle mutations occurring on both these levels or are simply not fast enough for searching large sequence databases because of the high computational costs of the underlying sequence-structure alignment problem. RESULTS: We present new fast index-based and online algorithms for approximate matching of RNA sequence-structure patterns supporting a full set of edit operations on single bases and base pairs. Our methods efficiently compute semi-global alignments of structural RNA patterns and substrings of the target sequence whose costs satisfy a user-defined sequence-structure edit distance threshold. For this purpose, we introduce a new computing scheme to optimally reuse the entries of the required dynamic programming matrices for all substrings and combine it with a technique for avoiding the alignment computation of non-matching substrings. Our new index-based methods exploit suffix arrays preprocessed from the target database and achieve running times that are sublinear in the size of the searched sequences. To support the description of RNA molecules that fold into complex secondary structures with multiple ordered sequence-structure patterns, we use fast algorithms for the local or global chaining of approximate sequence-structure pattern matches. The chaining step removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our improved online algorithm is faster than the best previous method by up to factor 45. Our best new index-based algorithm achieves a speedup of factor 560. CONCLUSIONS: The presented methods achieve considerable speedups compared to the best previous method. 
This, together with the expected sublinear running time of the presented index-based algorithms, allows for the first time approximate matching of RNA sequence-structure patterns in large sequence databases. Beyond the algorithmic contributions, we provide with RaligNAtor a robust and well documented open-source software package implementing the algorithms presented in this manuscript. The RaligNAtor software is available at http://www.zbh.uni-hamburg.de/ralignator.


Subjects
Algorithms , Base Sequence , Sequence Analysis, RNA , Base Pairing , Base Sequence/genetics , Computational Biology/methods , Computer Simulation , Databases, Factual , RNA/chemistry , RNA/genetics , Sequence Alignment , Sequence Analysis, RNA/methods , Software
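
The index-based methods in the abstract above rely on suffix arrays built from the target database. The sketch below shows only that underlying index idea for exact pattern lookup; it is a much simpler problem than the approximate sequence-structure matching implemented in RaligNAtor, and it is not taken from that software. The function names and the example sequence are hypothetical, and the naive construction is only suitable for short inputs.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Start positions of all suffixes of text in lexicographic order (naive)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: List[int], pattern: str) -> List[int]:
    """All start positions of pattern in text, found by binary search on sa."""
    m = len(pattern)

    # First suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # First suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[first:lo] start with pattern.
    return sorted(sa[first:lo])


# Hypothetical RNA target sequence.
target = "ACGUACGUAC"
index = build_suffix_array(target)
hits = find_occurrences(target, index, "ACG")
```

Each lookup needs a number of comparisons logarithmic in the text length once the index exists, which hints at why index-based search can avoid scanning the whole database for every query.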
14.
J Pathol ; 231(1): 130-41, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23794398

ABSTRACT

Deletion of 3p13 has been reported from about 20% of prostate cancers. The clinical significance of this alteration and the tumour suppressor gene(s) driving the deletion remain to be identified. We have mapped the 3p13 deletion locus using SNP array analysis and performed fluorescence in situ hybridization (FISH) analysis to search for associations between 3p13 deletion, prostate cancer phenotype and patient prognosis in a tissue microarray containing more than 3200 prostate cancers. SNP array analysis of 72 prostate cancers revealed a small deletion at 3p13 in 14 (19%) of the tumours, including the putative tumour suppressors FOXP1, RYBP and SHQ1. FISH analysis using FOXP1-specific probes revealed deletions in 16.5% and translocations in 1.2% of 1828 interpretable cancers. 3p13 deletions were linked to adverse features of prostate cancer, including advanced stage (p < 0.0001), high Gleason grade (p = 0.0125), and early PSA recurrence (p = 0.0015). In addition, 3p13 deletions were linked to ERG(+) cancers and to PTEN deletions (p < 0.0001 each). A subset analysis of ERG(+) tumours revealed that 3p13 deletions occurred independently from PTEN deletions (p = 0.3126), identifying tumours with 3p13 deletion as a distinct molecular subset of ERG(+) cancers. mRNA expression analysis confirmed that all 3p13 genes were down regulated by the deletion. Ectopic over-expression of FOXP1, RYBP and SHQ1 resulted in decreased colony-formation capabilities, corroborating a tumour suppressor function for all three genes. In summary, our data show that deletion of 3p13 defines a distinct and aggressive molecular subset of ERG(+) prostate cancers, which is possibly driven by inactivation of multiple tumour suppressors.


Subjects
Adenocarcinoma/genetics , Chromosome Deletion , Chromosomes, Human, Pair 3/genetics , Genes, Tumor Suppressor , Prostatic Neoplasms/genetics , Adenocarcinoma/metabolism , Adenocarcinoma/mortality , Adenocarcinoma/pathology , Cell Line, Tumor , Forkhead Transcription Factors/genetics , Forkhead Transcription Factors/metabolism , Gene Expression Profiling , Gene Knockdown Techniques , Germany/epidemiology , Humans , Kaplan-Meier Estimate , Male , Neoplasm Recurrence, Local , Oligonucleotide Array Sequence Analysis , Oncogene Proteins, Fusion/metabolism , Polymorphism, Single Nucleotide , Prostate/metabolism , Prostate/pathology , Prostatectomy , Prostatic Neoplasms/metabolism , Prostatic Neoplasms/mortality , Prostatic Neoplasms/pathology , Repressor Proteins/genetics , Repressor Proteins/metabolism , Tissue Array Analysis
15.
Environ Microbiol ; 15(5): 1551-60, 2013 May.
Article in English | MEDLINE | ID: mdl-23171403

ABSTRACT

We present data on the co-registered geochemistry (in situ mass spectrometry) and microbiology (pyrosequencing of 16S rRNA genes; V1, V2, V3 regions) in five fluid samples from Irina II in the Logatchev hydrothermal field. Two samples were collected over 24 min from the same spot and three further samples were from spatially distinct locations (20 cm, 3 m and the overlying plume). Four low-temperature hydrothermal fluids from Irina II are composed of the same core bacterial community, namely specific Gammaproteobacteria and Epsilonproteobacteria, which, however, differ in relative abundance. The microbial composition of the fifth sample (plume) is considerably different. Although a significant correlation between sulfide enrichment and proportions of Sulfurovum (Epsilonproteobacteria) was found, no other significant linkages between abiotic factors, i.e. temperature, hydrogen, methane, sulfide and oxygen, and bacterial lineages were evident. Intriguingly, bacterial community compositions of some time series samples from the same spot were significantly more similar to a sample collected 20 cm away than to each other. Although this finding is based on three single samples only, it provides first hints that single hydrothermal fluid samples collected on a small spatial scale may also reflect unrecognized temporal variability. However, further studies are required to support this hypothesis.


Subjects
Biodiversity , Hydrothermal Vents/chemistry , Hydrothermal Vents/microbiology , Seawater/chemistry , Seawater/microbiology , Hydrogen-Ion Concentration , Magnesium/analysis , Oxygen/analysis , Proteobacteria/genetics , Proteobacteria/isolation & purification , RNA, Ribosomal, 16S/genetics , Temperature , Time Factors
16.
Mob DNA ; 3(1): 18, 2012 Nov 07.
Article in English | MEDLINE | ID: mdl-23131050

ABSTRACT

BACKGROUND: Long terminal repeat (LTR) retrotransposons are a class of eukaryotic mobile elements characterized by a distinctive sequence similarity-based structure. Hence they are well suited for computational identification. Current software allows for a comprehensive genome-wide de novo detection of such elements. The obvious next step is the classification of newly detected candidates resulting in (super-)families. Such a de novo classification approach based on sequence-based clustering of transposon features has been proposed before, resulting in a preliminary assignment of candidates to families as a basis for subsequent manual refinement. However, such a classification workflow is typically split across a heterogeneous set of glue scripts and generic software (for example, spreadsheets), making it tedious for a human expert to inspect, curate and export the putative families produced by the workflow. RESULTS: We have developed LTRsift, an interactive graphical software tool for semi-automatic postprocessing of de novo predicted LTR retrotransposon annotations. Its user-friendly interface offers customizable filtering and classification functionality, displaying the putative candidate groups, their members and their internal structure in a hierarchical fashion. To ease manual work, it also supports graphical user interface-driven reassignment, splitting and further annotation of candidates. Export of grouped candidate sets in standard formats is possible. In two case studies, we demonstrate how LTRsift can be employed in the context of a genome-wide LTR retrotransposon survey effort. CONCLUSIONS: LTRsift is a useful and convenient tool for semi-automated classification of newly detected LTR retrotransposons based on their internal features. Its efficient implementation allows for convenient and seamless filtering and classification in an integrated environment. 
Developed for life scientists, it is helpful in postprocessing and refining the output of software for predicting LTR retrotransposons up to the stage of preparing full-length reference sequence libraries. The LTRsift software is freely available at http://www.zbh.uni-hamburg.de/LTRsift under an open-source license.

17.
Am J Pathol ; 181(2): 401-12, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22705054

ABSTRACT

The phosphatase and tensin homolog deleted on chromosome 10 (PTEN) gene is often altered in prostate cancer. To determine the prevalence and clinical significance of the different mechanisms of PTEN inactivation, we analyzed PTEN deletions in TMAs containing 4699 hormone-naïve and 57 hormone-refractory prostate cancers using fluorescence in situ hybridization analysis. PTEN mutations and methylation were analyzed in subsets of 149 and 34 tumors, respectively. PTEN deletions were present in 20.2% (458/2266) of prostate cancers, including 8.1% heterozygous and 12.1% homozygous deletions, and were linked to advanced tumor stage (P < 0.0001), high Gleason grade (P < 0.0001), presence of lymph node metastasis (P = 0.0002), hormone-refractory disease (P < 0.0001), presence of ERG gene fusion (P < 0.0001), and nuclear p53 accumulation (P < 0.0001). PTEN deletions were also associated with early prostate-specific antigen recurrence in univariate (P < 0.0001) and multivariate (P = 0.0158) analyses. The prognostic impact of PTEN deletion was seen in both ERG fusion-positive and ERG fusion-negative tumors. PTEN mutations were found in 4 (12.9%) of 31 cancers with heterozygous PTEN deletions but in only 1 (2%) of 59 cancers without PTEN deletion (P = 0.027). Aberrant PTEN promoter methylation was not detected in 34 tumors. The results of this study demonstrate that biallelic PTEN inactivation, by either homozygous deletion or deletion of one allele and mutation of the other, occurs in most PTEN-defective cancers and characterizes a particularly aggressive subset of metastatic and hormone-refractory prostate cancers.


Subjects
Gene Deletion; Oncogene Proteins, Fusion/metabolism; PTEN Phosphohydrolase/genetics; Prostate-Specific Antigen/metabolism; Prostatic Neoplasms/enzymology; Prostatic Neoplasms/pathology; Trans-Activators/metabolism; Aged; Biomarkers, Tumor/metabolism; Chromosomes, Human, Pair 10/genetics; DNA Methylation/genetics; DNA Mutational Analysis; Disease Progression; Epigenesis, Genetic; Genome, Human/genetics; Humans; Immunohistochemistry; Male; Middle Aged; Multivariate Analysis; PTEN Phosphohydrolase/metabolism; Phenotype; Polymorphism, Single Nucleotide/genetics; Promoter Regions, Genetic/genetics; Proportional Hazards Models; Recurrence; Transcriptional Regulator ERG; Tumor Suppressor Protein p53/metabolism
18.
BMC Bioinformatics ; 13: 82, 2012 May 06.
Article in English | MEDLINE | ID: mdl-22559072

ABSTRACT

BACKGROUND: Ongoing improvements in throughput of the next-generation sequencing technologies challenge the current generation of de novo sequence assemblers. Most recent sequence assemblers are based on the construction of a de Bruijn graph. An alternative framework of growing interest is the assembly string graph, not necessitating a division of the reads into k-mers, but requiring fast algorithms for the computation of suffix-prefix matches among all pairs of reads. RESULTS: Here we present efficient methods for the construction of a string graph from a set of sequencing reads. Our approach employs suffix sorting and scanning methods to compute suffix-prefix matches. Transitive edges are recognized and eliminated early in the process and the graph is efficiently constructed including irreducible edges only. CONCLUSIONS: Our suffix-prefix match determination and string graph construction algorithms have been implemented in the software package Readjoiner. Comparison with existing string graph-based assemblers shows that Readjoiner is faster and more space efficient. Readjoiner is available at http://www.zbh.uni-hamburg.de/readjoiner.
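
The two ideas named in the abstract above, suffix-prefix matches between reads and the removal of transitive edges, can be illustrated with a minimal Python sketch. This is not Readjoiner's method: the abstract describes suffix sorting and scanning, whereas the sketch compares all pairs of reads naively and treats transitivity as plain graph reachability without checking edge labels. The function names and the `min_len` parameter are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def longest_overlap(a, b, min_len):
    """Length of the longest proper suffix of a that equals a prefix of b.

    Returns 0 if no such overlap of at least min_len characters exists.
    """
    max_len = min(len(a), len(b)) - 1
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def overlap_graph(reads, min_len):
    """Naive all-pairs overlap graph: edges[i][j] = overlap length of read i -> read j."""
    edges = {}
    for i, j in permutations(range(len(reads)), 2):
        k = longest_overlap(reads[i], reads[j], min_len)
        if k:
            edges.setdefault(i, {})[j] = k
    return edges


def remove_transitive(edges):
    """Drop edge i -> k whenever edges i -> j and j -> k both exist.

    Simplification: a real string graph also checks that the sequences
    spelled by the two paths agree before discarding an edge.
    """
    reduced = {i: dict(nbrs) for i, nbrs in edges.items()}
    for i, nbrs in edges.items():
        for j in nbrs:
            for k in edges.get(j, {}):
                if k != i and k in nbrs:
                    reduced[i].pop(k, None)
    return reduced
```

For three reads laid out left to right with consecutive overlaps, the short overlap between the first and third read is implied by the two longer ones, so only the two irreducible edges remain after reduction. The quadratic pairwise comparison is exactly the cost that the suffix-sorting approach described in the abstract is designed to avoid.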


Subjects
Software; Algorithms; Computer Simulation; Genome, Human/genetics; Humans; Models, Genetic; Sequence Analysis, DNA/methods
19.
Article in English | MEDLINE | ID: mdl-22084150

ABSTRACT

Today's genome analysis applications require sequence representations allowing for fast access to their contents while also being memory-efficient enough to facilitate analyses of large-scale data. While a wide variety of sequence representations exist, lack of a generic implementation of efficient sequence storage has led to a plethora of poorly reusable or programming language-specific implementations. We present a novel, space-efficient data structure (GtEncseq) for storing multiple biological sequences of variable alphabet size, with customizable character transformations, wildcard support and an assortment of internal representations optimized for different distributions of wildcards and sequence lengths. For the human genome (3.1 gigabases, including 237 million wildcard characters) our representation requires only 2 + 8 × 10^-6 bits per character. Implemented in C, our portable software implementation provides a variety of methods for random and sequential access to characters and substrings (including different reading directions) using an object-oriented interface. In addition, it includes access to metadata like sequence descriptions or character distributions. The library is extensible to be used from various scripting languages. GtEncseq is much more versatile than previous solutions, adding features that were previously unavailable. Benchmarks show that it is competitive with respect to space and time requirements.
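
To give a feel for how a DNA sequence can be stored at close to 2 bits per character while still supporting wildcards, here is a minimal Python sketch. It is not the GtEncseq implementation (which is written in C and offers several internal representations); it only assumes the general idea of packing A, C, G and T into 2 bits each and recording runs of wildcard characters in a separate sorted list. All names are hypothetical, and every wildcard is decoded as `N`, which loses the distinction between different wildcard symbols.

```python
import bisect

CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def encode(seq):
    """Pack seq at 2 bits per character; return (packed, wildcard_runs, length).

    wildcard_runs is a sorted list of (start, run_length) tuples covering
    every maximal run of characters outside A, C, G, T.
    """
    packed = bytearray((len(seq) + 3) // 4)
    wildcard_runs = []
    run_start = None
    for i, ch in enumerate(seq.upper()):
        code = CODE.get(ch)
        if code is None:
            if run_start is None:
                run_start = i
            code = 0  # placeholder bits; the run list marks this position
        elif run_start is not None:
            wildcard_runs.append((run_start, i - run_start))
            run_start = None
        packed[i // 4] |= code << (2 * (i % 4))
    if run_start is not None:
        wildcard_runs.append((run_start, len(seq) - run_start))
    return packed, wildcard_runs, len(seq)


def char_at(packed, wildcard_runs, length, i):
    """Random access to position i; wildcard positions decode as 'N'."""
    if not 0 <= i < length:
        raise IndexError(i)
    # Find the last run starting at or before i (run lengths never exceed length).
    idx = bisect.bisect_right(wildcard_runs, (i, length)) - 1
    if idx >= 0:
        start, run_len = wildcard_runs[idx]
        if i < start + run_len:
            return "N"
    return BASES[(packed[i // 4] >> (2 * (i % 4))) & 3]
```

Because wildcards in genome assemblies tend to occur in long runs, the run list stays small relative to the sequence, which is the kind of effect that keeps the per-character overhead above 2 bits tiny in the figure quoted in the abstract.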


Subjects
Computational Biology/methods; Databases, Genetic; Information Storage and Retrieval/methods; Sequence Analysis; Algorithms; Models, Genetic; Multigene Family
20.
PLoS One ; 6(11): e26362, 2011.
Article in English | MEDLINE | ID: mdl-22140428

ABSTRACT

During cancer progression, specific genomic aberrations arise that can determine the scope of the disease and can be used as predictive or prognostic markers. The detection of specific gene amplifications or deletions in single blood-borne or disseminated tumour cells that may give rise to the development of metastases is of great clinical interest but technically challenging. In this study, we present a method for quantitative high-resolution genomic analysis of single cells. Cells were isolated under permanent microscopic control, followed by high-fidelity whole genome amplification and subsequent analyses by fine tiling array-CGH and qPCR. The assay was applied to single breast cancer cells to analyze the chromosomal region centred on the therapeutically relevant EGFR gene. This method allows precise quantitative analysis of copy number variations in single cell diagnostics.


Subjects
Genomics/methods; Neoplasms/genetics; Neoplasms/pathology; Single-Cell Analysis/methods; Cell Line, Tumor; Comparative Genomic Hybridization; ErbB Receptors/genetics; Genetic Heterogeneity; Humans; Neoplasms/blood; Polymerase Chain Reaction