Results 1 - 20 of 45
1.
Proc Natl Acad Sci U S A ; 112(44): 13669-74, 2015 Nov 03.
Article in English | MEDLINE | ID: mdl-26474830

ABSTRACT

Cyclodextrins are cyclic oligosaccharides widely used in the pharmaceutical industry to improve drug delivery and to increase the solubility of hydrophobic compounds. Anabaenolysins are lipopeptides produced by cyanobacteria with potent lytic activity in cholesterol-containing membranes. Here, we identified the 23- to 24-kb gene clusters responsible for the production of the lipopeptide anabaenolysin. The hybrid nonribosomal peptide synthetase and polyketide synthase biosynthetic gene cluster is encoded in the genomes of three anabaenolysin-producing strains of Anabaena. We detected previously unidentified strains producing known anabaenolysins A and B and discovered the production of new variants of anabaenolysins C and D. Bioassays demonstrated that anabaenolysins have weak antifungal activity against Candida albicans. Surprisingly, addition of the hydrophilic fraction of the whole-cell extracts increased the antifungal activity of the hydrophobic anabaenolysins. The fraction contained compounds identified by NMR as α-, β-, and γ-cyclodextrins, which undergo acetylation. Cyclodextrins have been used for decades to improve the solubility and bioavailability of many drugs including antifungal compounds. This study shows a natural example of cyclodextrins improving the solubility and efficacy of an antifungal compound in an ancient lineage of photosynthetic bacteria.


Subject(s)
Antifungal Agents/pharmacology , Bacterial Proteins/biosynthesis , Cyanobacteria/metabolism , Cyclodextrins/biosynthesis , Cyanobacteria/genetics , Genes, Bacterial , Molecular Sequence Data
2.
Antimicrob Agents Chemother ; 60(7): 3906-12, 2016 07.
Article in English | MEDLINE | ID: mdl-27067338

ABSTRACT

Efflux-mediated macrolide resistance due to mef(E) and mel, carried by the mega element, is common in Streptococcus pneumoniae, for which it was originally characterized, but it is rare in Streptococcus pyogenes. In S. pyogenes, mega was previously found to be enclosed in Tn2009, a composite genetic element of the Tn916 family containing tet(M) and conferring erythromycin and tetracycline resistance. In this study, S. pyogenes isolates containing mef(E), apparently not associated with other resistance determinants, were examined to characterize the genetic context of mega. By whole-genome sequencing of one isolate, MB56Spyo009, we identified a novel composite integrative and conjugative element (ICE) carrying mega, designated ICESpy009, belonging to the ICESa2603 family. ICESpy009 was 55 kb long, contained 61 putative open reading frames (ORFs), and was found to be integrated into hylA, a novel integration site for the ICESa2603 family. The modular organization of the ICE was similar to that of members of the ICESa2603 family carried by different streptococcal species. In addition, a novel cluster of accessory resistance genes was found inside a region that encloses mega. PCR mapping targeting ICESpy009 revealed the presence of a similar ICE in five other isolates under study. While in three isolates the integration site was the same as that of ICESpy009, in two isolates the ICE was integrated into rplL, the typical integration site of the ICESa2603 family. ICESpy009 was able to transfer macrolide resistance by conjugation to both S. pyogenes and S. pneumoniae, showing the first evidence of the transferability of mega from S. pyogenes.


Subject(s)
DNA Transposable Elements/genetics , Drug Resistance, Bacterial/genetics , Streptococcus pyogenes/genetics , Anti-Bacterial Agents/pharmacology , Erythromycin/pharmacology , Macrolides/pharmacology , Microbial Sensitivity Tests , Streptococcus pyogenes/drug effects , Tetracycline Resistance/genetics
3.
Angew Chem Int Ed Engl ; 55(11): 3596-9, 2016 Mar 07.
Article in English | MEDLINE | ID: mdl-26846478

ABSTRACT

Cyanobactins are a rapidly growing family of linear and cyclic peptides produced by cyanobacteria. Kawaguchipeptins A and B, two macrocyclic undecapeptides reported earlier from Microcystis aeruginosa NIES-88, are shown to be products of the cyanobactin biosynthetic pathway. The 9 kb kawaguchipeptin (kgp) gene cluster was identified in a 5.26 Mb draft genome of Microcystis aeruginosa NIES-88. We verified that this gene cluster is responsible for the production of the kawaguchipeptins through heterologous expression of the kgp gene cluster in Escherichia coli. The KgpF prenyltransferase was overexpressed and was shown to prenylate C-3 of Trp residues in both linear and cyclic peptides in vitro. Our findings serve to further enhance the structural diversity of cyanobactins to include tryptophan-prenylated cyclic peptides.


Subject(s)
Dimethylallyltranstransferase/metabolism , Tryptophan/metabolism , Amino Acid Sequence , Dimethylallyltranstransferase/chemistry , Escherichia coli/genetics , Genome, Bacterial , Microcystis/genetics , Multigene Family
4.
BMC Genomics ; 16: 100, 2015 Feb 19.
Article in English | MEDLINE | ID: mdl-25766668

ABSTRACT

BACKGROUND: Small RNAs include different classes essential for endogenous gene regulation and cellular defence against genomic parasites. However, a comprehensive analysis of the small RNA pathways in the germline of the mosquito Anopheles gambiae has never been performed despite their potential relevance to reproductive capacity in this malaria vector. RESULTS: We performed small RNA deep sequencing during larval and adult gonadogenesis and found that they predominantly express four classes of regulatory small RNAs. We identified 45 novel miRNA precursors, some of which were sex-biased and gonad-enriched, nearly doubling the number of previously known miRNA loci. We also determined multiple genomic clusters of 24-30 nt Piwi-interacting RNAs (piRNAs) that map to transposable elements (TEs) and 3'UTR of protein coding genes. Unusually, many TEs and the 3'UTR of some endogenous genes produce an abundant peak of 29-nt small RNAs with piRNA-like characteristics. Moreover, both sense and antisense piRNAs from TEs in both Anopheles gambiae and Drosophila melanogaster reveal novel features of piRNA sequence bias. We also discovered endogenous small interfering RNAs (endo-siRNAs) that map to overlapping transcripts and TEs. CONCLUSIONS: This is the first description of the germline miRNome in a mosquito species and should prove a valuable resource for understanding gene regulation that underlies gametogenesis and reproductive capacity. We also provide the first evidence of a piRNA pathway that is active against transposons in the germline and our findings suggest novel piRNA sequence bias. The contribution of small RNA pathways to germline TE regulation and genome defence in general is an important finding for approaches aimed at manipulating mosquito populations through the use of selfish genetic elements.


Subject(s)
Culicidae/genetics , Malaria/genetics , MicroRNAs/biosynthesis , RNA, Small Interfering/biosynthesis , Animals , Culicidae/growth & development , Gene Expression Regulation, Developmental , Genome, Insect , Germ Cells , Gonads , High-Throughput Nucleotide Sequencing , Malaria/parasitology , MicroRNAs/genetics , RNA, Small Interfering/genetics
5.
J Hum Evol ; 82: 88-94, 2015 May.
Article in English | MEDLINE | ID: mdl-25805042

ABSTRACT

In 1993, a fossil hominin skeleton was discovered in the karst caves of Lamalunga, near Altamura, in southern Italy. Despite the fact that this specimen represents one of the most extraordinary hominin specimens ever found in Europe, for the last two decades our knowledge of it has been based purely on the documented on-site observations. Recently, the retrieval from the cave of a fragment of bone (part of the right scapula) allowed the first dating of the individual, the quantitative analysis of a diagnostic morphological feature, and a preliminary paleogenetic characterization of this hominin skeleton from Altamura. Overall, the results concur in indicating that it belongs to the hypodigm of Homo neanderthalensis, with some phenetic peculiarities that appear consistent with a chronology ranging from 172 ± 15 ka to 130.1 ± 1.9 ka. Thus, the skeleton from Altamura represents the most ancient Neanderthal from which endogenous DNA has ever been extracted.


Subject(s)
Caves , Fossils , Neanderthals , Paleontology/methods , Skeleton , Animals , Base Sequence , DNA/analysis , History, Ancient , Italy , Molecular Sequence Data , Phylogeny , Scapula/chemistry , Skeleton/chemistry
6.
Proc Natl Acad Sci U S A ; 109(25): 9935-40, 2012 Jun 19.
Article in English | MEDLINE | ID: mdl-22665810

ABSTRACT

Sialic acid-recognizing Ig-like lectins (Siglecs) are signaling receptors that modulate immune responses, and are targeted for interactions by certain pathogens. We describe two primate Siglecs that were rendered nonfunctional by single genetic events during hominin evolution after our common ancestor with the chimpanzee. SIGLEC13 was deleted by an Alu-mediated recombination event, and a single base pair deletion disrupted the ORF of SIGLEC17. Siglec-13 is expressed on chimpanzee monocytes, innate immune cells that react to bacteria. The human SIGLEC17P pseudogene mRNA is still expressed at high levels in human natural killer cells, which bridge innate and adaptive immune responses. As both resulting pseudogenes are homozygous in all human populations, we resurrected the originally encoded proteins and examined their functions. Chimpanzee Siglec-13 and the resurrected human Siglec-17 recruit a signaling adapter and bind sialic acids. Expression of either Siglec in innate immune cells alters inflammatory cytokine secretion in response to Toll-like receptor-4 stimulation. Both Siglecs can also be engaged by two potentially lethal sialylated bacterial pathogens of newborns and infants, agents with a potential impact on reproductive fitness. Neanderthal and Denisovan genomes show human-like sequences at both loci, corroborating estimates that the initial pseudogenization events occurred in the common ancestral population of these hominins. Both loci also show limited polymorphic diversity, suggesting selection forces predating the origin of modern humans. Taken together, these data suggest that genetic elimination of Siglec-13 and/or Siglec-17 represents signatures of infectious and/or other inflammatory selective processes contributing to population restrictions during hominin origins.


Subject(s)
Evolution, Molecular , Gene Silencing , Lectins/genetics , Animals , Gene Deletion , Humans , Immune System , Primates , Sialic Acid Binding Immunoglobulin-like Lectins
7.
Blood ; 116(25): 5507-17, 2010 Dec 16.
Article in English | MEDLINE | ID: mdl-20864581

ABSTRACT

Integration of retroviral vectors in the human genome follows nonrandom patterns that favor insertional deregulation of gene expression and increase the risk of their use in clinical gene therapy. The molecular basis of retroviral target site selection is still poorly understood. We used deep sequencing technology to build genomewide, high-definition maps of > 60 000 integration sites of Moloney murine leukemia virus (MLV)- and HIV-based retroviral vectors in the genome of human CD34(+) multipotent hematopoietic progenitor cells (HPCs) and used gene expression profiling, chromatin immunoprecipitation, and bioinformatics to associate integration to genetic and epigenetic features of the HPC genome. Clusters of recurrent MLV integrations identify regulatory elements (alternative promoters, enhancers, evolutionarily conserved noncoding regions) within or around protein-coding genes and microRNAs with crucial functions in HPC growth and differentiation, bearing epigenetic marks of active or poised transcription (H3K4me1, H3K4me2, H3K4me3, H3K9Ac, Pol II) and specialized chromatin configurations (H2A.Z). Overall, we mapped 3500 high-frequency integration clusters, which represent a new resource for the identification of transcriptionally active regulatory elements. High-definition MLV integration maps provide a rational basis for predicting genotoxic risks in gene therapy and a new tool for genomewide identification of promoters and regulatory elements controlling hematopoietic stem and progenitor cell functions.


Subject(s)
Genome, Human , Hematopoietic Stem Cells/physiology , Regulatory Elements, Transcriptional/genetics , Retroviridae/genetics , Virus Integration/genetics , Biomarkers/metabolism , Cells, Cultured , Chromatin/genetics , Chromatin Immunoprecipitation , Epigenomics , Fetal Blood/cytology , Gene Expression Profiling , HIV/genetics , High-Throughput Nucleotide Sequencing , Humans , Moloney murine leukemia virus/genetics , Oligonucleotide Array Sequence Analysis , Promoter Regions, Genetic/genetics
8.
Genet Sel Evol ; 44: 21, 2012 Jul 06.
Article in English | MEDLINE | ID: mdl-22697611

ABSTRACT

In spite of past controversies, the field of ancient DNA is now a reliable research area due to recent methodological improvements. A series of recent large-scale studies have revealed the true potential of ancient DNA samples to study the processes of evolution and to test models and assumptions commonly used to reconstruct patterns of evolution and to analyze population genetics and palaeoecological changes. Recent advances in DNA technologies, such as next-generation sequencing, make it possible to recover DNA information from archaeological and paleontological remains, allowing us to go back in time and study the genetic relationships between extinct organisms and their contemporary relatives. With the next-generation sequencing methodologies, DNA sequences can be retrieved even from samples (for example human remains) for which the technical pitfalls of classical methodologies required stringent criteria to guarantee the reliability of the results. In this paper, we review the methodologies applied to ancient DNA analysis and the perspectives that next-generation sequencing applications provide in this field.


Subject(s)
DNA, Mitochondrial/genetics , Evolution, Molecular , Sequence Analysis, DNA/methods , Animals , Base Sequence , Cell Nucleus/genetics , DNA Damage , DNA, Mitochondrial/analysis , Extinction, Biological , Genome, Human , Humans , Mitochondria/genetics , Polymerase Chain Reaction/methods , Polymerase Chain Reaction/standards , Time Factors
10.
BMC Evol Biol ; 11: 32, 2011 Jan 31.
Article in English | MEDLINE | ID: mdl-21281509

ABSTRACT

BACKGROUND: Bos primigenius, the aurochs, is the wild ancestor of modern cattle breeds and was formerly widespread across Eurasia and northern Africa. After a progressive decline, the species became extinct in 1627. The origin of modern taurine breeds in Europe is debated. Archaeological and early genetic evidence point to a single Near Eastern origin and a subsequent spread during the diffusion of herding and farming. More recent genetic data are instead compatible with local domestication events or at least some level of local introgression from the aurochs. Here we present the analysis of the complete mitochondrial genome of a pre-Neolithic Italian aurochs. RESULTS: In this study, we applied a combined strategy employing both multiplex PCR amplifications and 454 pyrosequencing technology to sequence the complete mitochondrial genome of an 11,450-year-old aurochs specimen from Central Italy. Phylogenetic analysis of the aurochs mtDNA genome supports the conclusions from previous studies of short mtDNA fragments--namely that Italian aurochsen were genetically very similar to modern cattle breeds, but highly divergent from the North-Central European aurochsen. CONCLUSIONS: Complete mitochondrial genome sequences are now available for several modern cattle and two pre-Neolithic mtDNA genomes from very different geographic areas. These data suggest that previously identified sub-groups within the widespread modern cattle mitochondrial T clade are polyphyletic, and they support the hypothesis that modern European breeds have multiple geographic origins.


Subject(s)
Cattle/genetics , Genome, Mitochondrial , Paleontology , Animals , Cattle/classification , DNA, Mitochondrial/genetics , Evolution, Molecular , Italy , Molecular Sequence Data , Phylogeny
11.
Curr Biol ; 18(21): 1687-93, 2008 Nov 11.
Article in English | MEDLINE | ID: mdl-18976917

ABSTRACT

The Tyrolean Iceman was a witness to the Neolithic-Copper Age transition in Central Europe 5350-5100 years ago, and his mummified corpse was recovered from an Alpine glacier on the Austro-Italian border in 1991 [1]. Using a mixed sequencing procedure based on PCR amplification and 454 sequencing of pooled amplification products, we have retrieved the first complete mitochondrial-genome sequence of a prehistoric European. We have then compared it with 115 related extant lineages from mitochondrial haplogroup K. We found that the Iceman belonged to a branch of mitochondrial haplogroup K1 that has not yet been identified in modern European populations. This is the oldest complete Homo sapiens mtDNA genome generated to date. The results point to the potential significance of complete-ancient-mtDNA studies in addressing questions concerning the genetic history of human populations that the phylogeography of modern lineages is unable to tackle.


Subject(s)
DNA, Mitochondrial/genetics , Genome, Mitochondrial , Mummies , Humans , Male , Phylogeny , Preservation, Biological , Sequence Analysis, DNA
12.
Appl Environ Microbiol ; 77(20): 7271-8, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21873484

ABSTRACT

Cyanobacterial mass occurrences are common in fresh and brackish waters. They pose a threat to water users due to toxins frequently produced by the cyanobacterial species present. Anatoxin-a and homoanatoxin-a are neurotoxins synthesized by various cyanobacteria, e.g., Anabaena, Oscillatoria, and Aphanizomenon. The biosynthesis of these toxins and the genes involved in anatoxin production were recently described for Oscillatoria sp. strain PCC 6506 (A. Méjean et al., J. Am. Chem. Soc. 131:7512-7513, 2009). In this study, we identified the anatoxin synthetase gene cluster (anaA to anaG and orf1; 29 kb) in Anabaena sp. strain 37. The gene (81.6% to 89.2%) and amino acid (78.8% to 86.9%) sequences were highly similar to those of Oscillatoria sp. PCC 6506, while the organization of the genes differed. Molecular detection methods for potential anatoxin-a and homoanatoxin-a producers of the genera Anabaena, Aphanizomenon, and Oscillatoria were developed by designing primers to recognize the anaC gene. Anabaena and Oscillatoria anaC genes were specifically identified in several cyanobacterial strains by PCR. Restriction fragment length polymorphism (RFLP) analysis of the anaC amplicons enabled simultaneous identification of three producer genera: Anabaena, Oscillatoria, and Aphanizomenon. The molecular methods developed in this study revealed the presence of both Anabaena and Oscillatoria as potential anatoxin producers in Finnish fresh waters and the Baltic Sea; they could be applied for surveys of these neurotoxin producers in other aquatic environments.


Subject(s)
Anabaena/genetics , Anabaena/metabolism , Biosynthetic Pathways/genetics , Ligases/genetics , Multigene Family , Tropanes/metabolism , Aphanizomenon/genetics , Bacterial Proteins/genetics , Cyanobacteria Toxins , DNA Primers/genetics , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , Gene Order , Molecular Sequence Data , Oscillatoria/genetics , Polymerase Chain Reaction , Polymorphism, Restriction Fragment Length , Sequence Analysis, DNA , Sequence Homology, Nucleic Acid
13.
BMC Microbiol ; 11: 25, 2011 Feb 01.
Article in English | MEDLINE | ID: mdl-21284853

ABSTRACT

BACKGROUND: Streptococcus pneumoniae is an important human pathogen representing a major cause of morbidity and mortality worldwide. We sequenced the genome of a serotype 11A, ST62 S. pneumoniae invasive isolate (AP200), that was erythromycin-resistant due to the presence of the erm(TR) determinant, and carried out analysis of the genome organization and comparison with other pneumococcal genomes. RESULTS: The genome sequence of S. pneumoniae AP200 is 2,130,580 base pairs in length. The genome carries 2216 coding sequences (CDS), 56 tRNA, and 12 rRNA genes. Of the CDSs, 72.9% have a predicted biological known function. AP200 contains the pilus islet 2 and, although its phenotype corresponds to serotype 11A, it contains an 11D capsular locus. Chromosomal rearrangements resulting from a large inversion across the replication axis, and horizontal gene transfer events were observed. The chromosomal inversion is likely implicated in the rebalance of the chromosomal architecture affected by the insertions of two large exogenous elements, the erm(TR)-carrying Tn1806 and a functional prophage designated φSpn_200. Tn1806 is 52,457 bp in size and comprises 49 ORFs. Comparative analysis of Tn1806 revealed the presence of a similar genetic element or part of it in related species such as Streptococcus pyogenes and also in the anaerobic species Finegoldia magna, Anaerococcus prevotii and Clostridium difficile. The genome of φSpn_200 is 35,989 bp in size and is organized in 47 ORFs grouped into five functional modules. Prophages similar to φSpn_200 were found in pneumococci and in other streptococcal species, showing a high degree of exchange of functional modules. φSpn_200 viral particles have morphologic characteristics typical of the Siphoviridae family and are capable of infecting a pneumococcal recipient strain. CONCLUSIONS: The sequence of S. pneumoniae AP200 chromosome revealed a dynamic genome, characterized by chromosomal rearrangements and horizontal gene transfers. The overall diversity of AP200 is driven mainly by the presence of the exogenous elements Tn1806 and φSpn_200 that show large gene exchanges with other genetic elements of different bacterial species. These genetic elements likely provide AP200 with additional genes, such as those conferring antibiotic-resistance, promoting its adaptation to the environment.


Subject(s)
Genome, Bacterial , Streptococcus pneumoniae/genetics , Chromosomes, Bacterial/genetics , DNA Transposable Elements , DNA, Bacterial/genetics , Molecular Sequence Annotation , Molecular Sequence Data , Prophages/genetics , Sequence Analysis, DNA , Streptococcus pneumoniae/classification , Streptococcus pneumoniae/isolation & purification
14.
Proc Natl Acad Sci U S A ; 105(46): 17670-5, 2008 Nov 18.
Article in English | MEDLINE | ID: mdl-19001273

ABSTRACT

DNA encoding facilitates the construction and screening of large chemical libraries. Here, we describe general strategies for the stepwise coupling of coding DNA fragments to nascent organic molecules throughout individual reaction steps as well as the first implementation of high-throughput sequencing for the identification and relative quantification of the library members. The methodology was exemplified in the construction of a DNA-encoded chemical library containing 4,000 compounds and in the discovery of binders to streptavidin, matrix metalloproteinase 3, and polyclonal human IgG.


Subject(s)
DNA/analysis , Sequence Analysis, DNA/methods , Small Molecule Libraries/chemistry , Fluorescence Polarization , Humans , Immunoglobulin G/metabolism , Kinetics , Matrix Metalloproteinase 3/metabolism , Streptavidin/metabolism
15.
Sci Rep ; 10(1): 10700, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32612271

ABSTRACT

Umbria is located in Central Italy and took the name from its ancient inhabitants, the Umbri, whose origins are still debated. Here, we investigated the mitochondrial DNA (mtDNA) variation of 545 present-day Umbrians (with 198 entire mitogenomes) and 28 pre-Roman individuals (obtaining 19 ancient mtDNAs) excavated from the necropolis of Plestia. We found a rather homogeneous distribution of western Eurasian lineages across the region, with few notable exceptions. Contemporary inhabitants of the eastern part, delimited by the Tiber River and the Apennine Mountains, manifest a peculiar mitochondrial proximity to central-eastern Europeans, mainly due to haplogroups U4 and U5a, and an overrepresentation of J (30%) similar to the pre-Roman remains, also excavated in East Umbria. Local genetic continuities are further attested to by six terminal branches (H1e1, J1c3, J2b1, U2e2a, U8b1b1 and K1a4a) shared between ancient and modern mitogenomes. Eventually, we identified multiple inputs from various population sources that likely shaped the mitochondrial gene pool of ancient Umbri over time, since early Neolithic, including gene flows with central-eastern Europe. This diachronic mtDNA portrait of Umbria fits well with the genome-wide population structure identified on the entire peninsula and with historical sources that list the Umbri among the most ancient Italic populations.


Subject(s)
DNA, Mitochondrial/genetics , Demography , Genome, Mitochondrial/genetics , Human Migration , White People/genetics , Anthropology/methods , Gene Pool , Genetic Variation/genetics , Genetics, Population/methods , Geography , Humans , Italy , Mediterranean Region , Phylogeny
16.
BMC Genomics ; 10: 163, 2009 Apr 20.
Article in English | MEDLINE | ID: mdl-19379481

ABSTRACT

BACKGROUND: The cancer transcriptome is difficult to explore due to the heterogeneity of quantitative and qualitative changes in gene expression linked to the disease status. An increasing number of "unconventional" transcripts, such as novel isoforms, non-coding RNAs, somatic gene fusions and deletions have been associated with the tumoral state. Massively parallel sequencing techniques provide a framework for exploring the transcriptional complexity inherent to cancer with a limited laboratory and financial effort. We developed a deep sequencing and bioinformatics analysis protocol to investigate the molecular composition of a breast cancer poly(A)+ transcriptome. This method utilizes a cDNA library normalization step to diminish the representation of highly expressed transcripts and biology-oriented bioinformatic analyses to facilitate detection of rare and novel transcripts. RESULTS: We analyzed over 132,000 Roche 454 high-confidence deep sequencing reads from a primary human lobular breast cancer tissue specimen, and detected a range of unusual transcriptional events that were subsequently validated by RT-PCR in eight additional primary human breast cancer samples. We identified and validated one deletion, two novel ncRNAs (one intergenic and one intragenic), ten previously unknown or rare transcript isoforms and a novel gene fusion specific to a single primary tissue sample. We also explored the non-protein-coding portion of the breast cancer transcriptome, identifying thousands of novel non-coding transcripts and more than three hundred reads corresponding to the non-coding RNA MALAT1, which is highly expressed in many human carcinomas. CONCLUSION: Our results demonstrate that combining 454 deep sequencing with a normalization step and careful bioinformatic analysis facilitates the discovery and quantification of rare transcripts or ncRNAs, and can be used as a qualitative tool to characterize transcriptome complexity, revealing many hitherto unknown transcripts, splice isoforms, gene fusion events and ncRNAs, even at a relatively low sequence sampling.


Subject(s)
Breast Neoplasms/genetics , Gene Expression Profiling , Sequence Analysis, DNA , Transcription, Genetic , Amino Acid Sequence , Base Sequence , Breast Neoplasms/metabolism , Calmodulin-Binding Proteins/genetics , Computational Biology , Cytoskeletal Proteins/genetics , DNA, Complementary/chemistry , Databases, Genetic , Female , Gene Expression Regulation, Neoplastic , Histone-Lysine N-Methyltransferase/genetics , Humans , Molecular Sequence Data , Nuclear Proteins/genetics , RNA, Untranslated/metabolism , Reverse Transcriptase Polymerase Chain Reaction , Sequence Alignment , Sequence Analysis, RNA , Ubiquitin-Protein Ligases
17.
BMC Genomics ; 9: 174, 2008 Apr 16.
Article in English | MEDLINE | ID: mdl-18416813

ABSTRACT

BACKGROUND: Although the overlap of transcriptional units occurs frequently in eukaryotic genomes, its evolutionary and biological significance remains largely unclear. Here we report a comparative analysis of overlaps between genes coding for well-annotated proteins in five metazoan genomes (human, mouse, zebrafish, fruit fly and worm). RESULTS: For all analyzed species the observed number of overlapping genes is always lower than expected assuming functional neutrality, suggesting that gene overlap is negatively selected. The comparison to the random distribution also shows that retained overlaps do not exhibit random features: antiparallel overlaps are significantly enriched, while overlaps lying on the same strand and those involving coding sequences are highly underrepresented. We confirm that overlap is mostly species-specific and provide evidence that it frequently originates through the acquisition of terminal, non-coding exons. Finally, we show that overlapping genes tend to be significantly co-expressed in a breast cancer cDNA library obtained by 454 deep sequencing, and that different overlap types display different patterns of reciprocal expression. CONCLUSION: Our data suggest that overlap between protein-coding genes is selected against in Metazoa. However, when retained it may be used as a species-specific mechanism for the reciprocal regulation of neighboring genes. The tendency of overlaps to involve non-coding regions of the genes leads to the speculation that the advantages achieved by an overlapping arrangement may be optimized by evolving regulatory non-coding transcripts.


Subject(s)
Evolution, Molecular , Genes, Overlapping/genetics , Phylogeny , Animals , Breast Neoplasms/genetics , Caenorhabditis elegans/genetics , Conserved Sequence/genetics , Drosophila melanogaster/genetics , Gene Library , Humans , Mice , Models, Genetic , Zebrafish/genetics
18.
BMC Genomics ; 9: 464, 2008 Oct 08.
Article in English | MEDLINE | ID: mdl-18842124

ABSTRACT

BACKGROUND: A new priority in genome research is large-scale resequencing of genes to understand the molecular basis of hereditary disease and cancer. We assessed the ability of massively parallel pyrosequencing to identify sequence variants in pools. From a large collection of human PCR samples we selected 343 PCR products belonging to 16 disease genes and including a large spectrum of sequence variations previously identified by Sanger sequencing. The sequence variants included SNPs and small deletions and insertions (up to 44 bp), in homozygous or heterozygous state. RESULTS: The DNA was combined in 4 pools containing from 27 to 164 amplicons and from 8.9 to 50.8 kb to sequence, for a total of 110 kb. Pyrosequencing generated over 80 million base pairs of data. Blind searching for sequence variations with a specifically designed bioinformatics procedure identified 465 putative sequence variants, comprising 412 true variants and 53 false positives (in or adjacent to homopolymeric tracts), with no false negatives. All known variants in positions covered with at least 30x depth were correctly recognized. CONCLUSION: Massively parallel pyrosequencing may be used to simplify and speed the search for DNA variations in PCR products. Our results encourage further studies to evaluate molecular diagnostics applications.


Subject(s)
Genomics/methods , Sequence Analysis, DNA/methods , Genetic Diseases, Inborn/genetics , Genetic Variation/genetics , Humans , Mutation/genetics , Neoplasms/genetics , Polymorphism, Single Nucleotide/genetics
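A worked reading of the figures reported in the abstract of record 18 may help when comparing it with other variant-detection studies. The short Python sketch below derives precision, recall and an approximate mean sequencing depth from the counts given in the abstract. The variable names are illustrative and not taken from the paper. The depth figure assumes that the "over 80 million base pairs" were spread evenly over the 110 kb of target sequence, which the abstract does not state, so it is only a rough lower-bound estimate across the pooled samples. The recall value rests on the abstract's statement of no false negatives, made in the context of positions covered at 30x or more.

```python
# Counts as reported in the abstract of record 18 (pooled-amplicon pyrosequencing).
putative_variants = 465   # all calls made by the blind search
true_variants = 412       # calls confirmed as real (true positives)
false_positives = 53      # calls in or next to homopolymeric tracts
false_negatives = 0       # abstract reports none

# Sanity check: the two call categories should add up to the total.
assert true_variants + false_positives == putative_variants

# Precision: fraction of calls that were real variants.
precision = true_variants / (true_variants + false_positives)

# Recall (sensitivity): fraction of known variants that were called.
recall = true_variants / (true_variants + false_negatives)

# Rough mean depth, ASSUMING uniform coverage (not stated in the abstract).
total_bases_sequenced = 80_000_000   # "over 80 million base pairs"
target_length_bp = 110_000           # "a total of 110 kb"
mean_depth = total_bases_sequenced / target_length_bp

print(f"precision  = {precision:.3f}")
print(f"recall     = {recall:.3f}")
print(f"mean depth >= {mean_depth:.0f}x (uniform-coverage assumption)")
```

Under these assumptions, roughly 89% of the putative calls were real variants, and the 53 false positives account for the remainder.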
19.
Environ Microbiol ; 10(3): 653-64, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18190512

ABSTRACT

We developed a new tool to detect and identify hepatotoxin-producing cyanobacteria of the genera Anabaena, Microcystis, Planktothrix, Nostoc and Nodularia. Genus-specific probe pairs were designed for the detection of the microcystin (mcyE) and nodularin synthetase genes (ndaF) of these five genera to be used with a DNA-chip. The method couples a ligation detection reaction, in which the polymerase chain reaction (PCR)-amplified mcyE/ndaF genes are recognized by the probe pairs, with a hybridization on a universal microarray. All the probe pairs specifically detected the corresponding mcyE/ndaF gene sequences when DNA from the microcystin- or nodularin-producing cyanobacterial strains was used as template in the PCR. Furthermore, the strict specificity of detection enabled identification of the potential hepatotoxin producers. Detection of the genes was very sensitive; only 1-5 fmol of the PCR product were needed to produce signal intensities that exceeded the set background threshold level. The genus-specific probe pairs also reliably detected potential microcystin producers in DNA extracted from six lake and four brackish water samples. In lake samples, the same microcystin producers were identified with quantitative real-time PCR analysis. The specificity, sensitivity and ability of the DNA-chip in simultaneously detecting all the main hepatotoxin producers make this method suitable for high-throughput analysis and monitoring of environmental samples.


Subject(s)
Cyanobacteria/isolation & purification , Fresh Water/microbiology , Microcystins/biosynthesis , Oligonucleotide Array Sequence Analysis/methods , Peptides, Cyclic/metabolism , Bacterial Proteins/genetics , Bacterial Toxins/analysis , Bacterial Typing Techniques , Cyanobacteria/genetics , Cyanobacteria/metabolism , DNA Fingerprinting , DNA, Bacterial/genetics , Ecosystem , Fresh Water/analysis , Microcystins/analysis , Polymerase Chain Reaction
20.
BMC Bioinformatics ; 8 Suppl 1: S22, 2007 Mar 08.
Article in English | MEDLINE | ID: mdl-17430567

ABSTRACT

BACKGROUND: New high-throughput pyrosequencers such as the 454 Life Sciences GS 20 are capable of massively parallelizing DNA sequencing, providing an unprecedented rate of output data as well as potentially reducing costs. However, these new pyrosequencers bear a different error profile and provide shorter reads than those of a more traditional Sanger sequencer. These facts pose new challenges regarding how the data are handled and analyzed. In addition, the steep increase in sequencer throughput calls for substantial computing power at low cost. RESULTS: To address these challenges, we created an automated multi-step computation pipeline integrated with a database storage system. This allowed us to store, handle, index and search (1) the output data from the GS 20 sequencer; (2) analysis projects, possibly several for each dataset; (3) final results of analysis computations; and (4) intermediate results of computations (these allow manual comparisons and hence further searches by the biologists). Repeatability of computations was also a requirement. To access the needed computing power, we ported the pipeline to the European Grid: a large community of clusters, load balanced as a whole. To facilitate this Grid port we created Vnas: an innovative Grid job submission, virtual sandbox manager and job callback framework. After some runs of the pipeline aimed at tuning the parameters and thresholds for optimal results, we successfully analyzed 273 sequenced amplicons from a cancerous human sample and correctly found point mutations confirmed by either Sanger resequencing or NCBI dbSNP. The sequencing was performed with our 454 Life Sciences GS 20 pyrosequencer. CONCLUSION: We handled the steep increase in throughput from the new pyrosequencer by building an automated computation pipeline associated with database storage, and by leveraging the computing power of the European Grid. The Grid platform offers a very cost-effective choice for uneven workloads, typical in many scientific research fields, provided its peculiarities can be accepted (these are discussed). The infrastructure described was used to analyze human amplicons for mutations. More analyses will be performed in the future.


Subject(s)
Algorithms , DNA/chemistry , DNA/genetics , Database Management Systems , Databases, Genetic , Information Storage and Retrieval/methods , Sequence Analysis, DNA/methods