Results 1 - 15 of 15
1.
New Phytol ; 239(2): 766-777, 2023 07.
Article in English | MEDLINE | ID: mdl-37212044

ABSTRACT

There is growing evidence that cytonuclear incompatibilities (i.e. disruption of cytonuclear coadaptation) might contribute to the speciation process. In a former study, we described the possible involvement of plastid-nuclear incompatibilities in the reproductive isolation between four lineages of Silene nutans (Caryophyllaceae). Because organellar genomes are usually cotransmitted, we assessed whether the mitochondrial genome could also be involved in the speciation process, knowing that the gynodioecious breeding system of S. nutans is expected to impact the evolutionary dynamics of this genome. Using hybrid capture and high-throughput DNA sequencing, we analyzed diversity patterns in the genic content of the organellar genomes in the four S. nutans lineages. Contrary to the plastid genome, which exhibited a large number of fixed substitutions between lineages, extensive sharing of polymorphisms between lineages was found in the mitochondrial genome. In addition, numerous recombination-like events were detected in the mitochondrial genome, loosening the linkage disequilibrium between the organellar genomes and leading to decoupled evolution. These results suggest that gynodioecy shaped mitochondrial diversity through balancing selection, maintaining ancestral polymorphism and, thus, limiting the involvement of the mitochondrial genome in evolution of hybrid inviability between S. nutans lineages.


Subject(s)
Genome, Mitochondrial , Silene , Silene/genetics , Plant Breeding , Cell Nucleus/genetics , Mitochondria/genetics , Genome, Mitochondrial/genetics , Evolution, Molecular , Phylogeny
2.
Mol Phylogenet Evol ; 169: 107436, 2022 04.
Article in English | MEDLINE | ID: mdl-35131426

ABSTRACT

Early stages of speciation in plants might involve genetic incompatibilities between plastid and nuclear genomes, leading to inter-lineage hybrid breakdown due to the disruption between co-adapted plastid and nuclear genes encoding subunits of the same plastid protein complexes. We tested this hypothesis in Silene nutans, a gynodioecious Caryophyllaceae, where four distinct genetic lineages exhibited strong reproductive isolation among each other, resulting in chlorotic or variegated hybrids. By sequencing the whole gene content of the four plastomes through gene capture, and a large part of the nuclear genes encoding plastid subunits from RNAseq data, we searched for non-synonymous substitutions fixed in each lineage on both genomes. Lineages of S. nutans exhibited a high level of dN/dS ratios for plastid and nuclear genes encoding most plastid complexes, with a strong pattern of coevolution for genes encoding the subunits of ribosome and cytochrome b6/f that could explain the chlorosis of hybrids. Overall, relaxation of selection due to past bottlenecks and positive selection have driven the diversity pattern observed in S. nutans plastid complexes, leading to plastid-nuclear incompatibilities. We discuss the possible role of gynodioecy in the evolutionary dynamics of the plastomes through linked selection.


Subject(s)
Caryophyllaceae , Genome, Plastid , Silene , Caryophyllaceae/genetics , Evolution, Molecular , Phylogeny , Plastids/genetics , Reproductive Isolation , Silene/genetics
3.
Bioinformatics ; 36(12): 3894-3896, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32315402

ABSTRACT

MOTIVATION: Genome assembly is increasingly performed on long, uncorrected reads. Assembly quality may be degraded due to unfiltered chimeric reads; also, the storage of all read overlaps can take up to terabytes of disk space. RESULTS: We introduce two tools: yacrd for chimera removal and read scrubbing, and fpa for filtering out spurious overlaps. We show that yacrd results in higher-quality assemblies and is one hundred times faster than the best available alternative. AVAILABILITY AND IMPLEMENTATION: https://github.com/natir/yacrd and https://github.com/natir/fpa. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
High-Throughput Nucleotide Sequencing , Software , Sequence Analysis, DNA
4.
J Clin Virol ; 122: 104206, 2020 01.
Article in English | MEDLINE | ID: mdl-31783264

ABSTRACT

BACKGROUND: While respiratory viral infections are recognized as a frequent cause of illness in hematopoietic stem cell transplantation (HSCT) recipients, HCoV-OC43 infections have rarely been investigated as healthcare-associated infections in this population. OBJECTIVES: In this report, HCoV-OC43 isolates collected from HSCT patients were retrospectively characterized to identify potential clusters of infection that may indicate hospital transmission. STUDY DESIGN: Whole-genome and S gene sequences were obtained from nasal swabs using next-generation sequencing and phylogenetic trees were constructed. Similar identity matrix and determination of the most common ancestor were used to compare clusters of patients' sequences. Amino acid substitutions were analysed. RESULTS: Genotypes B, E, F and G were identified. Two clusters of patients were defined from chronological data and phylogenetic trees. Analyses of amino acid substitutions of the S protein sequences identified substitutions specific for genotype F strains circulating among European people. CONCLUSIONS: HCoV-OC43 may be implicated in healthcare-associated infections.


Subject(s)
Coronavirus Infections/virology , Coronavirus OC43, Human/genetics , Cross Infection/virology , Genome, Viral/genetics , Adult , Aged , Coronavirus Infections/epidemiology , Coronavirus Infections/transmission , Coronavirus OC43, Human/isolation & purification , Coronavirus OC43, Human/physiology , Cross Infection/epidemiology , Cross Infection/transmission , Europe/epidemiology , Female , Genotype , Hematopoietic Stem Cell Transplantation , Humans , Male , Middle Aged , Molecular Epidemiology , Phylogeny , Retrospective Studies , Whole Genome Sequencing , Young Adult
5.
Int J Mol Sci ; 20(19)2019 Sep 26.
Article in English | MEDLINE | ID: mdl-31561566

ABSTRACT

Mitochondrial genomes (mitogenomes) in higher plants can induce cytoplasmic male sterility and be somehow involved in nuclear-cytoplasmic interactions affecting plant growth and agronomic performance. They are larger and more complex than in other eukaryotes, due to their recombinogenic nature. For most plants, the mitochondrial DNA (mtDNA) can be represented as a single circular chromosome, the so-called master molecule, which includes repeated sequences that recombine frequently, generating sub-genomic molecules in various proportions. Based on the relevance of the potato crop worldwide, herewith we report the complete mtDNA sequence of two S. tuberosum cultivars, namely Cicero and Désirée, and a comprehensive study of its expression, based on high-coverage RNA sequencing data. We found that the potato mitogenome has a multi-partite architecture, divided into at least three independent molecules that, according to our data, should behave as autonomous chromosomes. Inter-cultivar variability was null, while comparative analyses with other species of the Solanaceae family allowed the investigation of the evolutionary history of their mitogenomes. The RNA-seq data revealed peculiarities in transcriptional and post-transcriptional processing of mRNAs. These included co-transcription of genes with open reading frames that are probably expressed, methylation of an rRNA at a position that should impact translation efficiency and extensive RNA editing, with a high proportion of partial editing implying frequent mis-targeting by the editing machinery.


Subject(s)
Gene Expression Profiling , Genome, Mitochondrial , Genomics , Solanum tuberosum/genetics , Whole Genome Sequencing , Amino Acid Sequence , Genomics/methods , Open Reading Frames , Phylogeny , RNA Editing
6.
Virology ; 531: 141-148, 2019 05.
Article in English | MEDLINE | ID: mdl-30878524

ABSTRACT

Genome sequencing of viruses has become a useful tool for better understanding virus pathogenicity and for epidemiological surveillance. Obtaining a virus genome sequence directly from clinical samples is still challenging because of the low load of viral genetic material compared to host DNA and the difficulty of obtaining an accurate genome assembly. Here we introduce a complete sequencing and analysis protocol called V-ASAP, for Virus Amplicon Sequencing Assembly Pipeline. Our protocol is able to generate the dominant viral genome sequence starting from clinical samples. It is based on multiplex PCR amplicon sequencing coupled with a reference-free analytical pipeline. This protocol was applied to 11 clinical samples infected with coronavirus OC43 (HCoV-OC43), and led to seven complete and two nearly complete genome assemblies. The protocol introduced here is shown to be robust and to produce a reliable sequence, and it could be applied to other viruses.


Subject(s)
Coronavirus Infections/virology , Coronavirus OC43, Human/genetics , Genome, Viral , Whole Genome Sequencing/methods , Coronavirus OC43, Human/classification , Coronavirus OC43, Human/isolation & purification , Humans , Multiplex Polymerase Chain Reaction
7.
Bioinformatics ; 35(21): 4239-4246, 2019 11 01.
Article in English | MEDLINE | ID: mdl-30918948

ABSTRACT

MOTIVATION: Long-read genome assembly tools are expected to reconstruct bacterial genomes nearly perfectly; however, they still produce fragmented assemblies in some cases. It would be beneficial to understand whether these cases are intrinsically impossible to resolve, or if assemblers are at fault, implying that genomes could be refined or even finished with little to no additional experimental cost. RESULTS: We propose a set of computational techniques to assist inspection of fragmented bacterial genome assemblies, through careful analysis of assembly graphs. By finding paths of overlapping raw reads between pairs of contigs, we recover potential short-range connections between contigs that were lost during the assembly process. We show that our procedure recovers 45% of missing contig adjacencies in fragmented Canu assemblies, on samples from the NCTC bacterial sequencing project. We also observe that a simple procedure based on enumerating weighted Hamiltonian cycles can suggest likely contig orderings. In our tests, the correct contig order is ranked first in half of the cases and within the top-three predictions in nearly all evaluated cases, providing a direction for finishing fragmented long-read assemblies. AVAILABILITY AND IMPLEMENTATION: https://gitlab.inria.fr/pmarijon/knot . SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome, Bacterial , High-Throughput Nucleotide Sequencing , Algorithms , Bacteria , Sequence Analysis, DNA , Software
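
The contig-ordering idea in the abstract above (enumerating weighted Hamiltonian cycles and ranking them) can be illustrated with a small brute-force sketch. This is not the code from the linked repository: the function name `rank_contig_orders`, the edge-weight dictionary, and the convention that a lower total weight means a more plausible order are all assumptions made for illustration, and exhaustive enumeration is only practical for a handful of contigs.

```python
from itertools import permutations


def rank_contig_orders(contigs, weights):
    """Rank every Hamiltonian cycle over `contigs` by total edge weight.

    `weights` maps frozenset({a, b}) to a hypothetical connection cost
    between contigs a and b (lower = better supported). Cycles that need
    a missing edge are discarded.
    """
    first, rest = contigs[0], contigs[1:]
    seen = set()
    ranked = []
    for perm in permutations(rest):
        order = (first,) + perm
        # A cycle and its mirror image describe the same circular order.
        mirror = (first,) + tuple(reversed(perm))
        if mirror in seen:
            continue
        seen.add(order)
        total = 0
        feasible = True
        for i, contig in enumerate(order):
            edge = frozenset((contig, order[(i + 1) % len(order)]))
            if edge not in weights:
                feasible = False
                break
            total += weights[edge]
        if feasible:
            ranked.append((total, order))
    return sorted(ranked)


# Toy example: four contigs on a circle, with two costly "shortcut" edges.
example_weights = {
    frozenset(("A", "B")): 1,
    frozenset(("B", "C")): 1,
    frozenset(("C", "D")): 1,
    frozenset(("D", "A")): 1,
    frozenset(("A", "C")): 5,
    frozenset(("B", "D")): 5,
}
best = rank_contig_orders(["A", "B", "C", "D"], example_weights)[0]
print(best)
```

Fixing the first contig and skipping mirror images keeps each circular order exactly once, so the ranked list can be read directly as a list of candidate orderings.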
8.
BMC Genomics ; 17(Suppl 10): 786, 2016 11 11.
Article in English | MEDLINE | ID: mdl-28185551

ABSTRACT

BACKGROUND: Transcriptome reconstruction, defined as the identification of all protein isoforms that may be expressed by a gene, is a notably difficult computational task. With real data, the best methods based on RNA-seq data identify barely 21 % of the expressed transcripts. While waiting for algorithms and sequencing techniques to improve - as has been strongly suggested in the literature - it is important to evaluate assisted transcriptome prediction; this is the question of how alternative transcription in one species performs as a predictor of protein isoforms in another relatively close species. Most evidence-based gene predictors use transcripts from other species to annotate a genome, but the predictive power of procedures that use exclusively transcripts from external species has never been quantified. The cornerstone of such an evaluation is the correct identification of pairs of transcripts with the same splicing patterns, called splicing orthologs. RESULTS: We propose a rigorous procedural definition of splicing orthologs, based on the identification of all ortholog pairs of splicing sites in the nucleotide sequences, and alignments at the protein level. Using our definition, we compared 24 382 human transcripts and 17 909 mouse transcripts from the highly curated CCDS database, and identified 11 122 splicing orthologs. In prediction mode, we show that human transcripts can be used to infer over 62 % of mouse protein isoforms. When restricting the predictions to transcripts known eight years ago, the percentage grows to 74 %. Using CCDS timestamped releases, we also analyze the evolution of the number of splicing orthologs over the last decade. CONCLUSIONS: Alternative splicing is now recognized to play a major role in the protein diversity of eukaryotic organisms, but definitions of spliced isoform orthologs are still approximate. Here we propose a definition adapted to the subtle variations of conserved alternative splicing sites, and use it to validate numerous accurate orthologous isoform predictions.


Subject(s)
Algorithms , Proteins/genetics , Transcriptome , Alternative Splicing , Animals , Computational Biology , Exons , Humans , Mice , Protein Isoforms/chemistry , Protein Isoforms/genetics , Protein Isoforms/metabolism , Proteins/chemistry , Proteins/metabolism , RNA/chemistry , RNA/genetics , RNA/metabolism
9.
BMC Genomics ; 16 Suppl 5: S6, 2015.
Article in English | MEDLINE | ID: mdl-26040958

ABSTRACT

BACKGROUND: In the context of ancestral gene order reconstruction from extant genomes, there exist two main computational approaches: rearrangement-based, and homology-based methods. The rearrangement-based methods consist in minimizing a total rearrangement distance on the branches of a species tree. The homology-based methods consist in the detection of a set of potential ancestral contiguity features, followed by the assembling of these features into Contiguous Ancestral Regions (CARs). RESULTS: In this paper, we present a new homology-based method that uses a progressive approach for both the detection and the assembling of ancestral contiguity features into CARs. The method is based on detecting a set of potential ancestral adjacencies iteratively using the current set of CARs at each step, and constructing CARs progressively using a 2-phase assembling method. CONCLUSION: We show the usefulness of the method through a reconstruction of the boreoeutherian ancestral gene order, and a comparison with three other homology-based methods: AnGeS, InferCARs and GapAdj. The program, written in Python, and the dataset used in this paper are available at http://bioinfo.lifl.fr/procars/.


Subject(s)
Animal Population Groups/genetics , Computational Biology/methods , Genome/genetics , Genomics/methods , Population Groups/genetics , Algorithms , Animals , Evolution, Molecular , Humans , Models, Genetic , Phylogeny
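
The general idea of assembling candidate ancestral adjacencies into Contiguous Ancestral Regions (CARs), as described in the abstract above, can be sketched with a simple greedy procedure. This is not the progressive 2-phase method of the paper: the function name `assemble_cars`, the unsigned gene labels, and the numeric support weights are assumptions for illustration only. Adjacencies are accepted from best to least supported, provided that no gene ends up with more than two neighbours and no cycle is closed.

```python
def assemble_cars(adjacencies):
    """Greedily chain adjacencies into linear regions (lists of genes).

    `adjacencies` is a list of (support, gene_a, gene_b) tuples, where a
    higher support value means a better supported candidate adjacency.
    """
    neighbours = {}
    parent = {}

    def find(gene):
        # Union-find lookup with path halving, used to detect cycles.
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for _support, a, b in sorted(adjacencies, reverse=True):
        for gene in (a, b):
            neighbours.setdefault(gene, [])
            parent.setdefault(gene, gene)
        if a == b:
            continue
        if len(neighbours[a]) >= 2 or len(neighbours[b]) >= 2:
            continue  # a gene inside a linear region has at most two neighbours
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            continue  # accepting this adjacency would close a cycle
        parent[root_a] = root_b
        neighbours[a].append(b)
        neighbours[b].append(a)

    # Walk each path from one of its endpoints to read out the regions.
    cars, visited = [], set()
    for gene in sorted(neighbours):
        if gene in visited or len(neighbours[gene]) == 2:
            continue
        path, prev, cur = [gene], None, gene
        visited.add(gene)
        while True:
            following = [g for g in neighbours[cur] if g != prev]
            if not following:
                break
            prev, cur = cur, following[0]
            path.append(cur)
            visited.add(cur)
        cars.append(path)
    return cars


candidate_adjacencies = [
    (3, "g1", "g2"),
    (3, "g2", "g3"),
    (2, "g4", "g5"),
    (1, "g2", "g4"),  # rejected: g2 already has two neighbours
    (1, "g3", "g1"),  # rejected: would close a cycle
]
print(assemble_cars(candidate_adjacencies))
```

A greedy pass like this is only a baseline; a progressive method would recompute the candidate adjacencies after each round using the current set of CARs.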
10.
Memory ; 21(8): 945-68, 2013.
Article in English | MEDLINE | ID: mdl-23485108

ABSTRACT

We propose a new method based on an algorithm usually dedicated to DNA sequence alignment in order to both reliably score short-term memory performance on immediate serial-recall tasks and analyse retention-error patterns. There can be considerable confusion on how performance on immediate serial list recall tasks is scored, especially when the to-be-remembered items are sampled with replacement. We discuss the utility of sequence-alignment algorithms to compare the stimuli to the participants' responses. The idea is that deletion, substitution, translocation, and insertion errors, which are typical in DNA, are also typical putative errors in short-term memory (respectively omission, confusion, permutation, and intrusion errors). We analyse four data sets in which alphanumeric lists included a few (or many) repetitions. After examining the method on two simple data sets, we show that sequence alignment offers 1) a compelling method for measuring capacity in terms of chunks when many regularities are introduced in the material (third data set) and 2) a reliable estimator of individual differences in short-term memory capacity. This study illustrates the difficulty of arriving at a good measure of short-term memory performance, and also attempts to characterise the primary factors underpinning remembering and forgetting.


Subject(s)
Memory, Short-Term , Mental Recall , Psychomotor Performance , Serial Learning , Algorithms , Computational Biology/methods , DNA/chemistry , Data Interpretation, Statistical , Female , Humans , Individuality , Male , Photic Stimulation , Young Adult
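
The scoring idea in the abstract above (aligning the presented list with the recalled list, as one would align two DNA sequences) can be illustrated with a minimal global-alignment sketch. It is not the authors' scoring procedure: the function name `score_recall`, the unit costs, and the restriction to three error types are assumptions for illustration, and translocation (permutation) errors are not modelled here.

```python
def score_recall(stimulus, response):
    """Align a presented list with a recalled list and count error types.

    Uses unit-cost global alignment (edit distance with traceback).
    A deletion is counted as an omission, a substitution as a confusion,
    and an insertion as an intrusion.
    """
    n, m = len(stimulus), len(response)
    # cost[i][j] = fewest edits turning stimulus[:i] into response[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = stimulus[i - 1] != response[j - 1]
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # match or confusion
                cost[i - 1][j] + 1,             # omission
                cost[i][j - 1] + 1,             # intrusion
            )

    # Trace one optimal alignment back from the bottom-right corner.
    counts = {"correct": 0, "confusion": 0, "omission": 0, "intrusion": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            mismatch = stimulus[i - 1] != response[j - 1]
            if cost[i][j] == cost[i - 1][j - 1] + mismatch:
                counts["confusion" if mismatch else "correct"] += 1
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            counts["omission"] += 1
            i -= 1
        else:
            counts["intrusion"] += 1
            j -= 1
    return counts


# The participant skipped the third item of a six-letter list.
print(score_recall("BKFJHL", "BKJHL"))
```

Unlike strict position-by-position scoring, the alignment credits the three items recalled after the skipped one, which is the behaviour the abstract argues for when lists contain repetitions.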
11.
BMC Bioinformatics ; 12 Suppl 9: S20, 2011 Oct 05.
Article in English | MEDLINE | ID: mdl-22152053

ABSTRACT

BACKGROUND: Segmental duplications in genomes have been studied for many years. Recently, several studies have highlighted a biological phenomenon called breakpoint-duplication that apparently associates a significant proportion of segmental duplications in mammals and in the Drosophila species group with breakpoints in rearrangement events. RESULTS: In this paper, we introduce and study a combinatorial problem, inspired by the breakpoint-duplication phenomenon, called the Genome Dedoubling Problem. It consists of finding a minimum length rearrangement scenario required to transform a genome with duplicated segments into a non-duplicated genome such that duplications are caused by rearrangement breakpoints. We show that the problem, in the Double-Cut-and-Join (DCJ) and the reversal rearrangement models, can be reduced to an APX-complete problem, and we provide algorithms for the Genome Dedoubling Problem with 2-approximable parts. We apply the methods to the reconstruction of a non-duplicated ancestor of Drosophila yakuba. CONCLUSIONS: We present the Genome Dedoubling Problem, and describe two algorithms solving the problem in the DCJ model and the reversal model. The usefulness of the problem and the methods is shown through an application to real Drosophila data.


Subject(s)
Evolution, Molecular , Genomics/methods , Segmental Duplications, Genomic , Algorithms , Animals , Drosophila/genetics , Models, Genetic
12.
Nucleic Acids Res ; 38(Web Server issue): W286-92, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20522509

ABSTRACT

DNA-binding transcription factors (TFs) play a central role in transcription regulation, and computational approaches that help in elucidating complex mechanisms governing this basic biological process are of great use. In this perspective, we present the TFM-Explorer web server that is a toolbox to identify putative TF binding sites within a set of upstream regulatory sequences of genes sharing some regulatory mechanisms. TFM-Explorer finds local regions showing overrepresentation of binding sites. Accepted organisms are human, mouse, rat, chicken and drosophila. The server employs a number of features to help users to analyze their data: visualization of selected binding sites on genomic sequences, and selection of cis-regulatory modules. TFM-Explorer is available at http://bioinfo.lifl.fr/TFM.


Subject(s)
Regulatory Elements, Transcriptional , Software , Transcription Factors/metabolism , Animals , Binding Sites , Brain/metabolism , Data Mining , Genomics/methods , Humans , Internet , Mice , Muscles/metabolism , Rats , Skin/metabolism
13.
BMC Genomics ; 11: 233, 2010 Apr 09.
Article in English | MEDLINE | ID: mdl-20380689

ABSTRACT

BACKGROUND: Despite their monophyletic origin, animal and plant mitochondrial genomes have been described as exhibiting different modes of evolution. Indeed, plant mitochondrial genomes feature a larger size, a lower mutation rate and more rearrangements than their animal counterparts. Gene order variation in animal mitochondrial genomes is often described as being due to translocation and inversion events, but tandem duplication followed by loss has also been proposed as an alternative process. In plant mitochondrial genomes, at the species level, gene shuffling and duplicate occurrence are such that no clear phylogeny has ever been identified, when considering genome structure variation. RESULTS: In this study we analyzed the whole sequences of eight mitochondrial genomes from maize and teosintes in order to comprehend the events that led to their structural features, i.e. the order of genes, tRNAs, rRNAs, ORFs, pseudogenes and non-coding sequences shared by all mitogenomes and duplicate occurrences. We suggest a tandem duplication model similar to the one described in animals, except that some duplicates can remain. This model enabled us to develop a manual method to deal with duplicates, a recurrent problem in rearrangement analyses. The phylogenetic tree exclusively based on rearrangement and duplication events is congruent with the tree based on sequence polymorphism, validating our evolution model. CONCLUSIONS: This study suggests more similarity than usually reported between plant and animal mitochondrial genomes in their mode of evolution. Further work will consist of developing new tools in order to automatically look for signatures of tandem duplication events in other plant mitogenomes and evaluate the occurrence of this process on a larger scale.


Subject(s)
Gene Duplication , Genome, Mitochondrial , Zea mays/genetics , Genome, Plant , Phylogeny
14.
Algorithms Mol Biol ; 2: 15, 2007 Dec 11.
Article in English | MEDLINE | ID: mdl-18072973

ABSTRACT

BACKGROUND: Position Weight Matrices (PWMs) are probabilistic representations of signals in sequences. They are widely used to model approximate patterns in DNA or in protein sequences. Using PWMs requires knowing the statistical significance of a word according to its score. This is done by defining the P-value of a score, which is the probability that the background model can achieve a score larger than or equal to the observed value. This gives rise to the following problem: given a P-value, find the corresponding score threshold. Existing methods rely on dynamic programming or probability generating functions. For many examples of PWMs, they fail to give accurate results in a reasonable amount of time. RESULTS: The contribution of this paper is twofold. First, we study the theoretical complexity of the problem, and we prove that it is NP-hard. Then, we describe a novel algorithm that solves the P-value problem efficiently. The main idea is to use a series of discretized score distributions that improves the final result step by step until some convergence criterion is met. Moreover, the algorithm is capable of calculating the exact P-value without any error, even for matrices with non-integer coefficient values. The same approach is also used to devise an accurate algorithm for the reverse problem: finding the P-value for a given score. Both methods are implemented in a software package called TFM-PVALUE, which is freely available. CONCLUSION: We have tested TFM-PVALUE on a large set of PWMs representing transcription factor binding sites. Experimental results show that it achieves better performance in terms of computational time and precision than existing tools.
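
The two problems defined in the abstract above (P-value of a score, and score threshold for a P-value) can be made concrete with a sketch of the plain dynamic-programming baseline on a single discretized score distribution. This is not the TFM-PVALUE algorithm, which refines a series of such distributions; the function names, the `granularity` parameter, and the assumption of an independent, identically distributed background model are all illustrative choices. Results are exact only when every matrix entry is a multiple of `granularity`.

```python
from collections import defaultdict


def score_distribution(pwm, background, granularity=1.0):
    """Distribution of the discretized PWM score under an i.i.d. background.

    `pwm` is a list of columns, each a dict letter -> score.
    `background` is a dict letter -> probability.
    Returns a dict mapping (score / granularity, rounded) -> probability.
    """
    dist = {0: 1.0}
    for column in pwm:
        updated = defaultdict(float)
        for score, prob in dist.items():
            for letter, value in column.items():
                updated[score + round(value / granularity)] += prob * background[letter]
        dist = dict(updated)
    return dist


def p_value(pwm, background, score, granularity=1.0):
    """Probability that a background word scores at least `score`."""
    dist = score_distribution(pwm, background, granularity)
    return sum(p for s, p in dist.items() if s * granularity >= score)


def score_threshold(pwm, background, p, granularity=1.0):
    """Lowest score whose P-value does not exceed `p` (None if there is none)."""
    dist = score_distribution(pwm, background, granularity)
    tail, best = 0.0, None
    for s in sorted(dist, reverse=True):
        tail += dist[s]
        if tail > p:
            break
        best = s * granularity
    return best


# Toy two-column matrix with a uniform DNA background.
toy_pwm = [
    {"A": 2, "C": 0, "G": 0, "T": -1},
    {"A": 1, "C": 1, "G": 0, "T": 0},
]
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(p_value(toy_pwm, uniform, 2))
print(score_threshold(toy_pwm, uniform, 0.25))
```

With non-integer matrices, a coarse `granularity` keeps the table small but rounds each column independently, so the rounding error accumulates; that trade-off is the accuracy problem the abstract attributes to existing methods.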

15.
J Mol Evol ; 60(2): 257-67, 2005 Feb.
Article in English | MEDLINE | ID: mdl-15785854

ABSTRACT

Single-celled apicomplexan parasites are known to cause major diseases in humans and animals including malaria, toxoplasmosis, and coccidiosis. The presence of apicoplasts with the remnant of a plastid-like DNA argues that these parasites evolved from photosynthetic ancestors possibly related to the dinoflagellates. Toxoplasma gondii displays amylopectin-like polymers within the cytoplasm of the dormant brain cysts. Here we report a detailed structural and comparative analysis of the Toxoplasma gondii, green alga Chlamydomonas reinhardtii, and dinoflagellate Crypthecodinium cohnii storage polysaccharides. We show Toxoplasma gondii amylopectin to be similar to the semicrystalline floridean starch accumulated by red algae. Unlike green plants or algae, the nuclear DNA sequences as well as biochemical and phylogenetic analysis argue that the Toxoplasma gondii amylopectin pathway has evolved from a totally different UDP-glucose-based metabolism similar to that of the floridean starch accumulating red alga Cyanidioschyzon merolae and, to a lesser extent, to those of glycogen storing animals or fungi. In both red algae and apicomplexan parasites, isoamylase and glucan-water dikinase sequences are proposed to explain the appearance of semicrystalline starch-like polymers. Our results have built a case for the separate evolution of semicrystalline storage polysaccharides upon acquisition of photosynthesis in eukaryotes.


Subject(s)
Evolution, Molecular , Polysaccharides/genetics , Polysaccharides/metabolism , Rhodophyta/genetics , Rhodophyta/metabolism , Toxoplasma/genetics , Toxoplasma/metabolism , Amino Acid Sequence , Animals , Chlamydomonas reinhardtii/genetics , Chlamydomonas reinhardtii/metabolism , Chlamydomonas reinhardtii/ultrastructure , Crystallization , Dinoflagellida/genetics , Dinoflagellida/metabolism , Dinoflagellida/ultrastructure , Glycogen Debranching Enzyme System/genetics , Humans , Microscopy, Electron , Phylogeny , Polysaccharides/chemistry , Rhodophyta/ultrastructure , Sequence Homology, Amino Acid , Toxoplasma/pathogenicity , Toxoplasma/ultrastructure