Search | VHL Regional Portal

1.

OrthoMaM v12: a database of curated single-copy ortholog alignments and trees to study mammalian evolutionary genomics.

Allio, Rémi; Delsuc, Frédéric; Belkhir, Khalid; Douzery, Emmanuel J P; Ranwez, Vincent; Scornavacca, Céline.

Nucleic Acids Res ; 52(D1): D529-D535, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-37843103

ABSTRACT

To date, the databases built to gather information on gene orthology do not provide end-users with descriptors of the molecular evolution information and phylogenetic pattern of these orthologues. In this context, we developed OrthoMaM, a database of ORTHOlogous MAmmalian Markers describing the evolutionary dynamics of coding sequences in mammalian genomes. OrthoMaM version 12 includes 15,868 alignments of orthologous coding sequences (CDS) from the 190 complete mammalian genomes currently available. All annotations and 1-to-1 orthology assignments are based on NCBI. Orthologous CDS can be mined for potential informative markers at the different taxonomic levels of the mammalian tree. To this end, several evolutionary descriptors of DNA sequences are provided for querying purposes (e.g. base composition and relative substitution rate). The graphical web interface allows the user to easily browse and sort the results of combined queries. The corresponding multiple sequence alignments and ML trees, inferred using state-of-the-art approaches, are available for download both at the nucleotide and amino acid levels. OrthoMaM v12 can be used by researchers interested either in reconstructing the phylogenetic relationships of mammalian taxa or in understanding the evolutionary dynamics of coding sequences in their genomes. OrthoMaM is available for browsing, querying and complete or filtered download at https://orthomam.mbb.cnrs.fr/.

Subject(s)

Databases, Genetic , Genomics , Animals , Base Sequence , Genome , Genomics/methods , Mammals/classification , Mammals/genetics , Phylogeny , Biological Evolution
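
The abstract of entry 1 mentions base composition as one of the evolutionary descriptors available for querying. As a rough illustration of what such a descriptor measures, the sketch below computes base frequencies and G+C content from a small alignment. It is a minimal sketch under stated assumptions: the function names `base_composition` and `gc_content` and the toy alignment are hypothetical, and nothing here reflects how OrthoMaM itself computes or stores its descriptors.

```python
from collections import Counter


def base_composition(aligned_seqs):
    """Return the relative frequency of A, C, G and T across an alignment.

    Gaps ('-') and ambiguous characters (e.g. 'N') are ignored.
    """
    counts = Counter()
    for seq in aligned_seqs:
        counts.update(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("alignment contains no unambiguous nucleotides")
    return {base: counts[base] / total for base in "ACGT"}


def gc_content(aligned_seqs):
    """Fraction of G and C among the unambiguous nucleotides."""
    comp = base_composition(aligned_seqs)
    return comp["G"] + comp["C"]


# Hypothetical toy alignment, not taken from OrthoMaM.
toy_alignment = ["ATGC-GTA", "ATGCCGTA", "ATGANGTA"]
print(base_composition(toy_alignment))
print(f"GC content: {gc_content(toy_alignment):.3f}")
```

Ignoring gaps and ambiguous characters keeps the frequencies comparable between alignments with different amounts of missing data; other conventions are possible.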

2.

Murine leukemia virus (MLV) P50 protein induces cell transformation via transcriptional regulatory function.

Akkawi, Charbel; Feuillard, Jerome; Diaz, Felipe Leon; Belkhir, Khalid; Godefroy, Nelly; Peloponese, Jean-Marie; Mougel, Marylene; Laine, Sebastien.

Retrovirology ; 20(1): 16, 2023 09 12.

Article in English | MEDLINE | ID: mdl-37700325

ABSTRACT

BACKGROUND: The murine leukemia virus (MLV) has been a powerful model of pathogenesis for the discovery of genes involved in cancer. Its splice donor (SD')-associated retroelement (SDARE) is important for infectivity and tumorigenesis, but the mechanism remains poorly characterized. Here, we show for the first time that P50 protein, which is produced from SDARE, acts as an accessory protein that transregulates transcription and induces cell transformation. RESULTS: By infecting cells with MLV particles containing SDARE transcript alone (lacking genomic RNA), we show that SDARE can spread to neighbouring cells as shown by the presence of P50 in infected cells. Furthermore, a role for P50 in cell transformation was demonstrated by CCK8, TUNEL and anchorage-independent growth assays. We identified the integrase domain of P50 as being responsible for transregulation of the MLV promoter using luciferase assay and RT-qPCR with P50 deletion mutants. Transcriptomic analysis furthermore revealed that the expression of hundreds of cellular RNAs involved in carcinogenesis was deregulated in the presence of P50, suggesting that P50 induces carcinogenic processes via its transcriptional regulatory function. CONCLUSION: We propose a novel SDARE-mediated mode of propagation of the P50 accessory protein in surrounding cells. Moreover, due to its transforming properties, P50 expression could lead to a cellular and tissue microenvironment that is conducive to cancer development.

Subject(s)

Gene Expression Profiling , Gene Expression Regulation , Mice , Animals , Genomics , Leukemia Virus, Murine/genetics , Promoter Regions, Genetic , RNA

3.

Nanopore sequencing of PCR products enables multicopy gene family reconstruction.

Namias, Alice; Sahlin, Kristoffer; Makoundou, Patrick; Bonnici, Iago; Sicard, Mathieu; Belkhir, Khalid; Weill, Mylène.

Comput Struct Biotechnol J ; 21: 3656-3664, 2023.

Article in English | MEDLINE | ID: mdl-37533804

ABSTRACT

The importance of gene amplifications in evolution is increasingly recognized. Yet, tools to study multi-copy gene families are still scarce, and many such families are overlooked using common sequencing methods. Haplotype reconstruction is even harder for polymorphic multi-copy gene families. Here, we show that all variants (or haplotypes) of a multi-copy gene family present in a single genome can be obtained using Oxford Nanopore Technologies sequencing of PCR products, followed by steps of mapping, SNP calling and haplotyping. As a proof of concept, we acquired the sequences of highly similar variants of the cidA and cidB genes present in the genome of the Wolbachia wPip, a bacterium infecting Culex pipiens mosquitoes. Our method relies on a wide database of cid genes, previously acquired by cloning and Sanger sequencing. We addressed problems commonly faced when using mapping approaches for multi-copy gene families with highly similar variants. In addition, we confirmed that PCR amplification causes frequent chimeras which have to be carefully considered when working on families of recombinant genes. We tested the robustness of the method using a combination of bioinformatics (read simulations) and molecular biology approaches (sequence acquisitions through cloning and Sanger sequencing, specific PCRs and digital droplet PCR). When different haplotypes present within a single genome cannot be reconstructed from short-read sequencing, this pipeline provides high-throughput acquisition and reliable results, as well as insights into the relative copy numbers of the different variants.

4.

Molecular complexity and gene expression controlling cell turnover during a digestive cycle of carnivorous sponge Lycopodina hypogea.

Le Goff, Emilie; Martinand-Mari, Camille; Belkhir, Khalid; Vacelet, Jean; Nidelet, Sabine; Godefroy, Nelly; Baghdiguian, Stephen.

Cell Tissue Res ; 388(2): 399-416, 2022 May.

Article in English | MEDLINE | ID: mdl-35260936

ABSTRACT

Lycopodina hypogea is a carnivorous sponge that tolerates laboratory husbandry very well. During a digestion cycle, performed without any digestive cavity, this species undergoes spectacular morphological changes leading to a total regression of long filaments that ensure the capture of prey and their reformation at the end of the cycle. This phenomenon is a unique opportunity to analyze the molecular and cellular determinants that ensure digestion in the sister group of all other metazoans. Using differential transcriptomic analysis coupled with cell biology studies of proliferation, differentiation, and programmed cell deaths (i.e., autophagy and the destructive/constructive function of apoptosis), we demonstrate that the molecular and cellular actors that ensure digestive homeostasis in a sister group of all remaining animals are similar in variety and complexity to those controlling tissue homeostasis in higher vertebrates. During a digestion cycle, most of these actors are finely tuned in a coordinated manner. Our data benefit from complementary approaches coupling in silico and cell biology studies and demonstrate that the nutritive function is provided by the coordination of a molecular network that impacts cell turnover in the entire organism.

Subject(s)

Apoptosis , Carnivory , Animals , Gene Expression

5.

The role of copy-number variation in the reinforcement of sexual isolation between the two European subspecies of the house mouse.

North, Henry L; Caminade, Pierre; Severac, Dany; Belkhir, Khalid; Smadja, Carole M.

Philos Trans R Soc Lond B Biol Sci ; 375(1806): 20190540, 2020 08 31.

Article in English | MEDLINE | ID: mdl-32654648

ABSTRACT

Reinforcement has the potential to generate strong reproductive isolation through the evolution of barrier traits as a response to selection against maladaptive hybridization, but the genetic changes associated with this process remain largely unexplored. Building upon the increasing evidence for a role of structural variants in adaptation and speciation, we addressed the role of copy-number variation in the reinforcement of sexual isolation evidenced between the two European subspecies of the house mouse. We characterized copy-number divergence between populations of Mus musculus musculus that display assortative mate choice, and those that do not, using whole-genome resequencing data. Updating methods to detect deletions and tandem duplications (collectively: copy-number variants, CNVs) in Pool-Seq data, we developed an analytical pipeline dedicated to identifying genomic regions showing the expected pattern of copy-number displacement under a reinforcement scenario. This strategy allowed us to detect 1824 deletions and seven tandem duplications that showed extreme differences in frequency between behavioural classes across replicate comparisons. A subset of 480 deletions and four tandem duplications were specifically associated with the derived trait of assortative mate choice. These 'Choosiness-associated' CNVs occur in hundreds of genes. Consistent with our hypothesis, such genes included olfactory receptors potentially involved in the olfactory-based assortative mate choice in this system as well as one gene, Sp110, that is known to show patterns of differential expression between behavioural classes in an organ used in mate choice, the vomeronasal organ. These results demonstrate that fine-scale structural changes are common and highly variable within species, despite being under-studied, and may be important targets of reinforcing selection in this system and others.
This article is part of the theme issue 'Towards the completion of speciation: the evolution of reproductive isolation beyond the first barriers'.

Subject(s)

DNA Copy Number Variations , Mice/physiology , Reproductive Isolation , Animals , Europe , Mice/genetics

6.

Sponge digestive system diversity and evolution: filter feeding to carnivory.

Godefroy, Nelly; Le Goff, Emilie; Martinand-Mari, Camille; Belkhir, Khalid; Vacelet, Jean; Baghdiguian, Stephen.

Cell Tissue Res ; 377(3): 341-351, 2019 Sep.

Article in English | MEDLINE | ID: mdl-31053892

ABSTRACT

Sponges are an ancient basal life form, so understanding their evolution is key to understanding all metazoan evolution. Sponges have very unusual feeding mechanisms, with an intricate network of progressively optimized filtration units: from the simple choanocyte lining of a central cavity, or spongocoel, to more complex chambers and canals. Furthermore, in a single evolutionary event, a group of sponges transitioned to carnivory. This major evolutionary transition involved replacing the filter-feeding apparatus with mobile phagocytic cells that migrate collectively towards the trapped prey. Here, we focus on the diversity and evolution of sponge nutrition systems and the amazing adaptation to carnivory.

Subject(s)

Carnivory/psychology , Digestive System/growth & development , Porifera/physiology , Animals , Biological Evolution , Morphogenesis , Phylogeny

7.

Parallel pattern of differentiation at a genomic island shared between clinal and mosaic hybrid zones in a complex of cryptic seahorse lineages.

Riquet, Florentine; Liautard-Haag, Cathy; Woodall, Lucy; Bouza, Carmen; Louisy, Patrick; Hamer, Bojan; Otero-Ferrer, Francisco; Aublanc, Philippe; Béduneau, Vickie; Briard, Olivier; El Ayari, Tahani; Hochscheid, Sandra; Belkhir, Khalid; Arnaud-Haond, Sophie; Gagnaire, Pierre-Alexandre; Bierne, Nicolas.

Evolution ; 73(4): 817-835, 2019 04.

Article in English | MEDLINE | ID: mdl-30854632

ABSTRACT

Diverging semi-isolated lineages either meet in narrow clinal hybrid zones, or have a mosaic distribution associated with environmental variation. Intrinsic reproductive isolation is often emphasized in the former and local adaptation in the latter, although both reduce gene flow between groups. Rarely are these two patterns of spatial distribution reported in the same study system. Here, we report that the long-snouted seahorse Hippocampus guttulatus is subdivided into discrete panmictic entities by both types of hybrid zones. Along the European Atlantic coasts, a northern and a southern lineage meet in the southwest of France where they coexist in sympatry (i.e., in the same geographical zone) with little hybridization. In the Mediterranean Sea, two lineages have a mosaic distribution, associated with lagoon-like and marine habitats. A fifth lineage was identified in the Black Sea. Genetic homogeneity over large spatial scales contrasts with isolation maintained in sympatry or close parapatry at a fine scale. A high variation in locus-specific introgression rates provides additional evidence that partial reproductive isolation must be maintaining the divergence. We find that fixed differences between lagoon and marine populations in the Mediterranean Sea belong to the most differentiated SNPs between the two Atlantic lineages, against the genome-wide pattern of structure that mostly follows geography. These parallel outlier SNPs cluster on a single chromosome-wide island of differentiation. Since Atlantic lineages do not map to lagoon-sea habitat variation, genetic parallelism at the genomic island suggests a shared genetic barrier contributes to reproductive isolation in contrasting contexts (i.e., spatial versus ecological). We discuss how a genomic hotspot of parallel differentiation could have evolved and become associated both with space and with a patchy environment in a single study system.

Subject(s)

Gene Flow , Genome , Hybridization, Genetic , Reproductive Isolation , Smegmamorpha/genetics , Animals , Biological Evolution , Europe

8.

OrthoMaM v10: Scaling-Up Orthologous Coding Sequence and Exon Alignments with More than One Hundred Mammalian Genomes.

Scornavacca, Celine; Belkhir, Khalid; Lopez, Jimmy; Dernat, Rémy; Delsuc, Frédéric; Douzery, Emmanuel J P; Ranwez, Vincent.

Mol Biol Evol ; 36(4): 861-862, 2019 04 01.

Article in English | MEDLINE | ID: mdl-30698751

ABSTRACT

We present version 10 of OrthoMaM, a database of orthologous mammalian markers. OrthoMaM is already 11 years old and since the outset it has kept on improving, providing high-quality alignments and phylogenetic trees computed with state-of-the-art methods on up-to-date data. The main contribution of this version is the increase in the number of taxa: 116 mammalian genomes for 14,509 one-to-one orthologous genes. This has been made possible by the combination of genomic data deposited in Ensembl complemented by additional good-quality genomes only available in NCBI. Version 10 users will benefit from pipeline improvements and a completely redesigned web interface.

Subject(s)

Databases, Genetic , Genome , Mammals/genetics , Phylogeny , Sequence Alignment , Animals

9.

A software tool 'CroCo' detects pervasive cross-species contamination in next generation sequencing data.

Simion, Paul; Belkhir, Khalid; François, Clémentine; Veyssier, Julien; Rink, Jochen C; Manuel, Michaël; Philippe, Hervé; Telford, Maximilian J.

BMC Biol ; 16(1): 28, 2018 03 05.

Article in English | MEDLINE | ID: mdl-29506533

ABSTRACT

BACKGROUND: Multiple RNA samples are frequently processed together and often mixed before multiplex sequencing in the same sequencing run. While different samples can be separated post sequencing using sample barcodes, the possibility of cross contamination between biological samples from different species that have been processed or sequenced in parallel has the potential to be extremely deleterious for downstream analyses. RESULTS: We present CroCo, a software package for identifying and removing such cross contaminants from assembled transcriptomes. Using multiple, recently published sequence datasets, we show that cross contamination is consistently present at varying levels in real data. Using real and simulated data, we demonstrate that CroCo detects contaminants efficiently and correctly. Using a real example from a molecular phylogenetic dataset, we show that contaminants, if not eliminated, can have a decisive, deleterious impact on downstream comparative analyses. CONCLUSIONS: Cross contamination is pervasive in new and published datasets and, if undetected, can have serious deleterious effects on downstream analyses. CroCo is a database-independent, multi-platform tool, designed for ease of use, that efficiently and accurately detects and removes cross contamination in assembled transcriptomes to avoid these problems. We suggest that the use of CroCo should become a standard cleaning step when processing multiple samples for transcriptome sequencing.

Subject(s)

Computational Biology/standards , Databases, Genetic/standards , High-Throughput Nucleotide Sequencing/standards , Phylogeny , RNA, Messenger/genetics , Software/standards , Animals , Computational Biology/methods , Gene Expression Profiling/methods , Gene Expression Profiling/standards , High-Throughput Nucleotide Sequencing/methods , Hydrozoa , RNA, Messenger/analysis , Species Specificity

10.

Whole exome sequencing of wild-derived inbred strains of mice improves power to link phenotype and genotype.

Chang, Peter L; Kopania, Emily; Keeble, Sara; Sarver, Brice A J; Larson, Erica; Orth, Annie; Belkhir, Khalid; Boursot, Pierre; Bonhomme, François; Good, Jeffrey M; Dean, Matthew D.

Mamm Genome ; 28(9-10): 416-425, 2017 Oct.

Article in English | MEDLINE | ID: mdl-28819774

ABSTRACT

The house mouse is a powerful model to dissect the genetic basis of phenotypic variation, and serves as a model to study human diseases. Despite a wealth of discoveries, most classical laboratory strains have captured only a small fraction of genetic variation known to segregate in their wild progenitors, and existing strains are often related to each other in complex ways. Inbred strains of mice independently derived from natural populations have the potential to increase power in genetic studies with the addition of novel genetic variation. Here, we perform exome-enrichment and high-throughput sequencing (~8× coverage) of 26 wild-derived strains known in the mouse research community as the "Montpellier strains." We identified 1.46 million SNPs in our dataset, approximately 19% of which have not been detected from other inbred strains. This novel genetic variation is expected to contribute to phenotypic variation, as it includes 18,496 nonsynonymous variants and 262 early stop codons. Simulations demonstrate that the higher density of genetic variation in the Montpellier strains provides increased power for quantitative genetic studies. Inasmuch as the power to connect genotype to phenotype depends on genetic variation, it is important to incorporate these additional genetic strains into future research programs.

Subject(s)

Animals, Wild/genetics , Exome Sequencing , Genetic Variation/genetics , Genotype , Mice, Inbred Strains/genetics , Phenotype , Animals , Codon, Terminator , Computer Simulation , Crosses, Genetic , Female , High-Throughput Nucleotide Sequencing , Mice , Mice, Inbred Strains/classification , Phylogeny , Polymorphism, Single Nucleotide , Sequence Analysis, DNA

11.

The ace-1 Locus Is Amplified in All Resistant Anopheles gambiae Mosquitoes: Fitness Consequences of Homogeneous and Heterogeneous Duplications.

Assogba, Benoît S; Milesi, Pascal; Djogbénou, Luc S; Berthomieu, Arnaud; Makoundou, Patrick; Baba-Moussa, Lamine S; Fiston-Lavier, Anna-Sophie; Belkhir, Khalid; Labbé, Pierrick; Weill, Mylène.

PLoS Biol ; 14(12): e2000618, 2016 Dec.

Article in English | MEDLINE | ID: mdl-27918584

ABSTRACT

Gene copy-number variations are widespread in natural populations, but investigating their phenotypic consequences requires contemporary duplications under selection. Such duplications have been found at the ace-1 locus (encoding the organophosphate and carbamate insecticides' target) in the mosquito Anopheles gambiae (the major malaria vector); recent studies have revealed their intriguing complexity, consistent with the involvement of various numbers and types (susceptible or resistant to insecticide) of copies. We used an integrative approach, from genome to phenotype level, to investigate the influence of duplication architecture and gene-dosage on mosquito fitness. We found that both heterogeneous (i.e., one susceptible and one resistant ace-1 copy) and homogeneous (i.e., identical resistant copies) duplications segregated in field populations. The number of copies in homogeneous duplications was variable and positively correlated with acetylcholinesterase activity and resistance level. Determining the genomic structure of the duplicated region revealed that, in both types of duplication, ace-1 and 11 other genes formed tandem 203 kb amplicons. We developed a diagnostic test for duplications, which showed that ace-1 was amplified in all 173 resistant mosquitoes analyzed (field-collected in several African countries), in heterogeneous or homogeneous duplications. Each type was associated with different fitness trade-offs: heterogeneous duplications conferred an intermediate phenotype (lower resistance and fitness costs), whereas homogeneous duplications tended to increase both resistance and fitness cost, in a complex manner. The type of duplication selected seemed thus to depend on the intensity and distribution of selection pressures. This versatility of trade-offs available through gene duplication highlights the importance of large mutation events in adaptation to environmental variation. This impressive adaptability could have a major impact on vector control in Africa.

Subject(s)

Anopheles/genetics , Gene Duplication , Genes, Insect , Animals , Chromosome Mapping , DNA Copy Number Variations

12.

Local interspecies introgression is the main cause of extreme levels of intraspecific differentiation in mussels.

Fraïsse, Christelle; Belkhir, Khalid; Welch, John J; Bierne, Nicolas.

Mol Ecol ; 25(1): 269-86, 2016 01.

Article in English | MEDLINE | ID: mdl-26137909

ABSTRACT

Structured populations, and replicated zones of contact between species, are an ideal opportunity to study regions of the genome with unusual levels of differentiation; and these can illuminate the genomic architecture of species isolation, and the spread of adaptive alleles across species ranges. Here, we investigated the effects of gene flow on divergence and adaptation in the Mytilus complex of species, including replicated parental populations in quite distant geographical locations. We used target enrichment sequencing of 1269 contigs of a few kb each, including some genes of known function, to infer gene genealogies at a small chromosomal scale. We show that geography is an important determinant of the genomewide patterns of introgression in Mytilus and that gene flow between different species, with contiguous ranges, explained up to half of the intraspecific outliers. This suggests that local introgression is both widespread and tends to affect larger chromosomal regions than purely intraspecific processes. We argue that this situation might be common, and this implies that genome scans should always consider the possibility of introgression from sister species, unsampled differentiated backgrounds, or even extinct relatives, for example Neanderthals in humans. The hypothesis that reticulate evolution over long periods of time contributes widely to adaptation, and to the spatial and genomic reorganization of genetic backgrounds, needs to be more widely considered to make better sense of genome scans.

Subject(s)

Gene Flow , Genetic Speciation , Genetics, Population , Mytilus/genetics , Alleles , Animals , Contig Mapping , Geography , Mytilus/classification , Phylogeny , Polymorphism, Single Nucleotide , Sequence Analysis, DNA

13.

European sea bass genome and its variation provide insights into adaptation to euryhalinity and speciation.

Tine, Mbaye; Kuhl, Heiner; Gagnaire, Pierre-Alexandre; Louro, Bruno; Desmarais, Erick; Martins, Rute S T; Hecht, Jochen; Knaust, Florian; Belkhir, Khalid; Klages, Sven; Dieterich, Roland; Stueber, Kurt; Piferrer, Francesc; Guinand, Bruno; Bierne, Nicolas; Volckaert, Filip A M; Bargelloni, Luca; Power, Deborah M; Bonhomme, François; Canario, Adelino V M; Reinhardt, Richard.

Nat Commun ; 5: 5770, 2014 Dec 23.

Article in English | MEDLINE | ID: mdl-25534655

ABSTRACT

The European sea bass (Dicentrarchus labrax) is a temperate-zone euryhaline teleost of prime importance for aquaculture and fisheries. This species is subdivided into two naturally hybridizing lineages, one inhabiting the north-eastern Atlantic Ocean and the other the Mediterranean and Black seas. Here we provide a high-quality chromosome-scale assembly of its genome that shows a high degree of synteny with the more highly derived teleosts. We find expansions of gene families specifically associated with ion and water regulation, highlighting adaptation to variation in salinity. We further generate a genome-wide variation map through RAD-sequencing of Atlantic and Mediterranean populations. We show that variation in local recombination rates strongly influences the genomic landscape of diversity within and differentiation between lineages. Comparing predictions of alternative demographic models to the joint allele-frequency spectrum indicates that genomic islands of differentiation between sea bass lineages were generated by varying rates of introgression across the genome following a period of geographical isolation.

Subject(s)

Adaptation, Physiological , Bass/genetics , Genetic Speciation , Genome , Animals , Atlantic Ocean , Bass/physiology , Chromosome Mapping , Genetic Variation , Molecular Sequence Data , Phylogeny

14.

OrthoMaM v8: a database of orthologous exons and coding sequences for comparative genomics in mammals.

Douzery, Emmanuel J P; Scornavacca, Celine; Romiguier, Jonathan; Belkhir, Khalid; Galtier, Nicolas; Delsuc, Frédéric; Ranwez, Vincent.

Mol Biol Evol ; 31(7): 1923-8, 2014 Jul.

Article in English | MEDLINE | ID: mdl-24723423

ABSTRACT

Comparative genomic studies extensively rely on alignments of orthologous sequences. Yet, selecting, gathering, and aligning orthologous exons and protein-coding sequences (CDS) that are relevant for a given evolutionary analysis can be a difficult and time-consuming task. In this context, we developed OrthoMaM, a database of ORTHOlogous MAmmalian Markers describing the evolutionary dynamics of orthologous genes in mammalian genomes using a phylogenetic framework. Since its first release in 2007, OrthoMaM has regularly evolved, not only to include newly available genomes but also to incorporate up-to-date software in its analytic pipeline. This eighth release integrates the 40 complete mammalian genomes available in Ensembl v73 and provides alignments, phylogenies, evolutionary descriptor information, and functional annotations for 13,404 single-copy orthologous CDS and 6,953 long exons. The graphical interface allows users to easily explore OrthoMaM to identify markers with specific characteristics (e.g., taxa availability, alignment size, %G+C, evolutionary rate, chromosome location). It hence provides an efficient solution to sample preprocessed markers adapted to user-specific needs. OrthoMaM has proven to be a valuable resource for researchers interested in mammalian phylogenomics and evolutionary genomics, and has served as a source of benchmark empirical data sets in several methodological studies. OrthoMaM is available for browsing, query and complete or filtered downloads at http://www.orthomam.univ-montp2.fr/.

Subject(s)

Databases, Genetic , Mammals/classification , Mammals/genetics , Animals , Base Sequence , Conserved Sequence , Evolution, Molecular , Exons , Genomics , Humans , Phylogeny , Sequence Alignment , Software , Web Browser

15.

Transcriptome characterisation of the ant Formica exsecta with new insights into the evolution of desaturase genes in social hymenoptera.

Badouin, Hélène; Belkhir, Khalid; Gregson, Emma; Galindo, Juan; Sundström, Liselotte; Martin, Stephen J; Butlin, Roger K; Smadja, Carole M.

PLoS One ; 8(7): e68200, 2013.

Article in English | MEDLINE | ID: mdl-23874539

ABSTRACT

BACKGROUND: Despite the recent sequencing of seven ant genomes, no genomic data are available for the genus Formica, an important group for the study of eusocial traits. We sequenced the transcriptome of the ant Formica exsecta with the 454 FLX Titanium technology from a pooled sample of workers from 70 Finnish colonies. RESULTS: About 1,000,000 reads were obtained from a normalised cDNA library. We compared the assemblers MIRA3.0 and Newbler2.6 and showed that the latter performed better on this dataset due to a new option which is dedicated to improving contig formation in low depth portions of the assemblies. The 29,579 contigs represent 27 Mb. 50% showed similarity with known proteins and 25% could be assigned a category of gene ontology. We found more than 13,000 high-quality single nucleotide polymorphisms. The Δ9 desaturase gene family is an important multigene family involved in chemical communication in insects. We found six Δ9 desaturases in this Formica exsecta transcriptome dataset that were used to reconstruct a maximum-likelihood phylogeny of insect desaturases and to test for signatures of positive selection in this multigene family in ant lineages. We found differences with previous phylogenies of this gene family in ants, and found two clades potentially under positive selection. CONCLUSION: This first transcriptome reference sequence of Formica exsecta provided sequence and polymorphism data that will allow researchers working on Formica ants to develop studies to tackle the genetic basis of eusocial phenotypes. In addition, this study provided some general guidelines for de novo transcriptome assembly that should be useful for future transcriptome sequencing projects. Finally, we found potential signatures of positive selection in some clades of the Δ9 desaturase gene family in ants, which suggest the potential role of sequence divergence and adaptive evolution in shaping the large diversity of chemical cues in social insects.

Subject(s)

Ants/enzymology , Ants/genetics , Evolution, Molecular , Fatty Acid Desaturases/genetics , Genes, Insect/genetics , Social Behavior , Transcriptome/genetics , Animals , Gene Ontology , Likelihood Functions , Molecular Sequence Annotation , Multigene Family , Phylogeny , Polymorphism, Single Nucleotide , Sequence Analysis, DNA

16.

Bio++: efficient extensible libraries and tools for computational molecular evolution.

Guéguen, Laurent; Gaillard, Sylvain; Boussau, Bastien; Gouy, Manolo; Groussin, Mathieu; Rochette, Nicolas C; Bigot, Thomas; Fournier, David; Pouyet, Fanny; Cahais, Vincent; Bernard, Aurélien; Scornavacca, Céline; Nabholz, Benoît; Haudry, Annabelle; Dachary, Loïc; Galtier, Nicolas; Belkhir, Khalid; Dutheil, Julien Y.

Mol Biol Evol ; 30(8): 1745-50, 2013 Aug.

Article in English | MEDLINE | ID: mdl-23699471

ABSTRACT

Efficient algorithms and programs for the analysis of the ever-growing amount of biological sequence data are strongly needed in the genomics era. The pace at which new data and methodologies are generated calls for the use of pre-existing, optimized (yet extensible) code, typically distributed as libraries or packages. This motivated the Bio++ project, aiming at developing a set of C++ libraries for sequence analysis, phylogenetics, population genetics, and molecular evolution. The main attractiveness of Bio++ is the extensibility and reusability of its components through its object-oriented design, without compromising the computer-efficiency of the underlying methods. We present here the second major release of the libraries, which provides an extended set of classes and methods. These extensions notably provide built-in access to sequence databases and new data structures for handling and manipulating sequences from the omics era, such as multiple genome alignments and sequencing reads libraries. More complex models of sequence evolution, such as mixture models and generic n-tuples alphabets, are also included.

Subject(s)

Computational Biology , Evolution, Molecular , Software , Algorithms , Computational Biology/methods , Genomics/methods , Humans , Internet

17.

Reference-free population genomics from next-generation transcriptome data and the vertebrate-invertebrate gap.

Gayral, Philippe; Melo-Ferreira, José; Glémin, Sylvain; Bierne, Nicolas; Carneiro, Miguel; Nabholz, Benoit; Lourenco, Joao M; Alves, Paulo C; Ballenghien, Marion; Faivre, Nicolas; Belkhir, Khalid; Cahais, Vincent; Loire, Etienne; Bernard, Aurélien; Galtier, Nicolas.

PLoS Genet ; 9(4): e1003457, 2013 Apr.

Article in English | MEDLINE | ID: mdl-23593039

ABSTRACT

In animals, the population genomic literature is dominated by two taxa, namely mammals and drosophilids, in which fully sequenced, well-annotated genomes have been available for years. Data from other metazoan phyla are scarce, probably because the vast majority of living species still lack a closely related reference genome. Here we achieve de novo, reference-free population genomic analysis from wild samples in five non-model animal species, based on next-generation sequencing transcriptome data. We introduce a pipeline for cDNA assembly, read mapping, SNP/genotype calling, and data cleaning, with specific focus on the issue of hidden paralogy detection. In two species for which a reference genome is available, similar results were obtained whether the reference was used or not, demonstrating the robustness of our de novo inferences. The population genomic profiles of a hare, a turtle, an oyster, a tunicate, and a termite were found to be intermediate between those of human and Drosophila, indicating that the discordant genomic diversity patterns that have been reported between these two species do not reflect a generalized vertebrate versus invertebrate gap. The genomic average diversity was generally higher in invertebrates than in vertebrates (with the notable exception of termite), in agreement with the notion that population size tends to be larger in the former than in the latter. The non-synonymous to synonymous ratio, however, did not differ significantly between vertebrates and invertebrates, even though it was negatively correlated with genetic diversity within each of the two groups. This study opens promising perspectives regarding genome-wide population analyses of non-model organisms and the influence of population size on non-synonymous versus synonymous diversity.

Subject(s)

Drosophila/genetics , Genome, Human , Metagenomics , Transcriptome/genetics , Animals , Base Sequence , Genotype , Hares/genetics , High-Throughput Nucleotide Sequencing , Humans , Invertebrates/genetics , Isoptera/genetics , Ostreidae/genetics , Polymorphism, Single Nucleotide , Turtles/genetics , Urochordata/genetics , Vertebrates/genetics

18.

Patterns and evolution of nucleotide landscapes in seed plants.

Serres-Giardi, Laurana; Belkhir, Khalid; David, Jacques; Glémin, Sylvain.

Plant Cell ; 24(4): 1379-97, 2012 Apr.

Article in English | MEDLINE | ID: mdl-22492812

ABSTRACT

Nucleotide landscapes, which are the way base composition is distributed along a genome, strongly vary among species. The underlying causes of these variations have been much debated. Though mutational bias and selection were initially invoked, GC-biased gene conversion (gBGC), a recombination-associated process favoring the G and C over A and T bases, is increasingly recognized as a major factor. As opposed to vertebrates, evolution of GC content is less well known in plants. Most studies have focused on the GC-poor and homogeneous Arabidopsis thaliana genome and the much more GC-rich and heterogeneous rice (Oryza sativa) genome and have often been generalized as a dicot/monocot dichotomy. This vision is clearly phylogenetically biased and does not allow understanding the mechanisms involved in GC content evolution in plants. To tackle these issues, we used EST data from more than 200 species and provided the most comprehensive description of gene GC content across the seed plant phylogeny so far available. As opposed to the classically assumed dicot/monocot dichotomy, we found continuous variations in GC content from the probably ancestral GC-poor and homogeneous genomes to the more derived GC-rich and highly heterogeneous ones, with several independent enrichment episodes. Our results suggest that gBGC could play a significant role in the evolution of GC content in plant genomes.

Subject(s)

Evolution, Molecular , Nucleotides/genetics , Plants/genetics , Seeds/genetics , Base Composition/genetics , Codon/genetics , Databases, Genetic , Expressed Sequence Tags , Gene Expression Regulation, Plant , Genes, Plant/genetics , Genetic Variation , Phylogeny , RNA, Messenger/genetics , RNA, Messenger/metabolism , Reproducibility of Results , Species Specificity , Statistics, Nonparametric , Transcriptome/genetics

19.

SNP detection from de novo transcriptome sequencing in the bivalve Macoma balthica: marker development for evolutionary studies.

Pante, Eric; Rohfritsch, Audrey; Becquet, Vanessa; Belkhir, Khalid; Bierne, Nicolas; Garcia, Pascale.

PLoS One ; 7(12): e52302, 2012.

Article in English | MEDLINE | ID: mdl-23300636

ABSTRACT

Hybrid zones are noteworthy systems for the study of environmental adaptation to fast-changing environments, as they constitute reservoirs of polymorphism and are key to the maintenance of biodiversity. They can move in relation to climate fluctuations, as temperature can affect both selection and migration, or remain trapped by environmental and physical barriers. There is therefore a very strong incentive to study the dynamics of hybrid zones subjected to climate variations. The infaunal bivalve Macoma balthica emerges as a noteworthy model species, as divergent lineages hybridize, and its native NE Atlantic range is currently contracting to the North. To investigate the dynamics and functioning of hybrid zones in M. balthica, we developed new molecular markers by sequencing the collective transcriptome of 30 individuals. Ten individuals were pooled for each of the three populations sampled at the margins of two hybrid zones. A single 454 run generated 277 Mb from which 17K SNPs were detected. SNP density averaged 1 polymorphic site every 14 to 19 bases, for mitochondrial and nuclear loci, respectively. An FST scan detected high genetic divergence among several hundred SNPs, some of them involved in energetic metabolism, cellular respiration and physiological stress. The high population differentiation, recorded for nuclear-encoded ATP synthase and NADH dehydrogenase as well as most mitochondrial loci, suggests cytonuclear genetic incompatibilities. Results from this study will help pave the way to a high-resolution study of hybrid zone dynamics in M. balthica, and the relative importance of endogenous and exogenous barriers to gene flow in this system.

Subject(s)

Bivalvia/genetics , Evolution, Molecular , Gene Expression Profiling , Genetic Markers/genetics , Polymorphism, Single Nucleotide/genetics , Adaptation, Physiological/genetics , Animals , Bivalvia/physiology , Genetic Loci/genetics , Linkage Disequilibrium/genetics , Molecular Sequence Annotation , Selection, Genetic

20.

Gene flow at major transitional areas in sea bass (Dicentrarchus labrax) and the possible emergence of a hybrid swarm.

Quéré, Nolwenn; Desmarais, Erick; Tsigenopoulos, Costas S; Belkhir, Khalid; Bonhomme, François; Guinand, Bruno.

Ecol Evol ; 2(12): 3061-78, 2012 Dec.

Article in English | MEDLINE | ID: mdl-23301173

ABSTRACT

The population genetic structure of sea bass (Dicentrarchus labrax) along a transect from the Atlantic Ocean (AO) to the Eastern Mediterranean (EM) Sea differs from that of most other marine taxa in this area. Three populations (AO, Western Mediterranean [WM], EM) are recognized today, which were originally two allopatric populations. How two ancestral genetic units have evolved into three distinct units has not been addressed yet. Therefore, to investigate mechanisms that lead to the emergence of the central WM population, its current status, and its connectivity with the two parental populations, we applied 20 nuclear loci that were either gene associated or gene independent. Results confirmed the existence of three distinct gene pools, with higher differentiation at two transitional areas, the Almeria-Oran Front (AOF) and the Siculo-Tunisian Strait (STS), than within any population. Significant linkage disequilibrium and heterozygote excess indicated that the STS is probably another tension zone, as already described for the AOF. Neutrality tests failed to reveal marker loci that could be driven by selection within or among metapopulations, except for locus DLA0068. Collectively, results support that the central WM population arose by trapping two tension zones at distinct geographic locations of limited connectivity. Population assignment further revealed that WM individuals were more introgressed than individuals from the other two metapopulations. This suggests that this population might result from hybrid swarming, and was or is still seeded by genes received through the filter of each tension zone.
