Results 1 - 13 of 13
1.
Front Plant Sci ; 15: 1339594, 2024.
Article in English | MEDLINE | ID: mdl-38601302

ABSTRACT

The tree Eucalyptus camaldulensis is a ubiquitous member of the Eucalyptus genus, which includes several hundred species. Despite the extensive sequencing and assembly of nuclear genomes from various eucalypts, the genus has only one fully annotated and complete mitochondrial genome (mitogenome). Plant mitochondria are characterized by dynamic genomic rearrangements, facilitated by repeat content, a feature that has hindered the assembly of plant mitogenomes. This complexity is evident in the paucity of available mitogenomes. This study, to the best of our knowledge, presents the first E. camaldulensis mitogenome. Our findings suggest the presence of multiple isomeric forms of the E. camaldulensis mitogenome and provide novel insights into minor rearrangements triggered by nested repeat sequences. A comparative sequence analysis of the E. camaldulensis and E. grandis mitogenomes unveils evolutionary changes between the two genomes. A significant divergence is the evolution of a large repeat sequence, which may have contributed to the differences observed between the two genomes. The largest repeat sequences in the E. camaldulensis mitogenome align well with significant yet unexplained structural variations in the E. grandis mitogenome, highlighting the adaptability of repeat sequences in plant mitogenomes.

2.
Genome Med ; 15(1): 114, 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-38098057

ABSTRACT

BACKGROUND: Long-read whole genome sequencing (lrWGS) has the potential to address the technical limitations of exome sequencing in ways not possible by short-read WGS. However, its utility in autosomal recessive Mendelian diseases is largely unknown. METHODS: In a cohort of 34 families in which the suspected autosomal recessive diseases remained undiagnosed by exome sequencing, lrWGS was performed on the Pacific Biosciences Sequel IIe platform. RESULTS: Likely causal variants were identified in 13 families (38% of the cohort). These include (1) a homozygous splicing SV in TYMS as a novel candidate gene for lethal neonatal lactic acidosis, (2) a homozygous non-coding SV that we propose impacts STK25 expression and causes a novel neurodevelopmental disorder, (3) a compound heterozygous SV in RP1L1 with complex inheritance pattern in a family with inherited retinal disease, (4) homozygous deep intronic variants in LEMD2 and SNAP91 as novel candidate genes for neurodevelopmental disorders in two families, and (5) a promoter SNV in SLC4A4 causing non-syndromic band keratopathy. Surprisingly, we also encountered causal variants that could have been identified by short-read exome sequencing in 7 families. The latter highlight scenarios that are especially challenging at the interpretation level. CONCLUSIONS: Our data highlight the continued need to address the interpretation challenges in parallel with efforts to improve the sequencing technology itself. We propose a path forward for the implementation of lrWGS in the setting of autosomal recessive diseases in a way that maximizes its utility.


Subject(s)
Exome , Inheritance Patterns , Infant, Newborn , Humans , Genes, Recessive , Mutation , Exome Sequencing , Pedigree , Eye Proteins/genetics , Membrane Proteins/genetics , Nuclear Proteins/genetics , Protein Serine-Threonine Kinases/genetics , Intracellular Signaling Peptides and Proteins/genetics
3.
Genome Biol ; 22(1): 256, 2021 09 03.
Article in English | MEDLINE | ID: mdl-34479618

ABSTRACT

Currently, different sequencing platforms are used to generate plant genomes and no workflow has been properly developed to optimize time, cost, and assembly quality. We present LeafGo, a complete de novo plant genome workflow, that starts from tissue and produces genomes with modest laboratory and bioinformatic resources in approximately 7 days and using one long-read sequencing technology. LeafGo is optimized with ten different plant species, three of which are used to generate high-quality chromosome-level assemblies without any scaffolding technologies. Finally, we report the diploid genomes of Eucalyptus rudis and E. camaldulensis and the allotetraploid genome of Arachis hypogaea.


Subject(s)
Genome, Plant , High-Throughput Nucleotide Sequencing/methods , Plant Leaves/genetics , Software , Arachis/genetics , DNA, Plant/genetics , DNA, Plant/isolation & purification , Diploidy , Species Specificity , Tetraploidy , Time Factors
4.
BMC Med Genomics ; 13(1): 103, 2020 07 17.
Article in English | MEDLINE | ID: mdl-32680510

ABSTRACT

BACKGROUND: Choosing a testing strategy is crucial for genetics clinics and testing laboratories. In this study, we compared the hit rate between solo, trio, and trio plus testing, and between trio and sibship testing. Finally, we studied the impact of extended family analysis, mainly in complex and unsolved cases. METHODS: Three cohorts were used for this analysis: one cohort to assess the hit rate between solo, trio and trio plus testing, another cohort to examine the impact of the testing strategy of sibship genome vs trio-based analysis, and a third cohort to test the impact of an extended family analysis of up to eight family members to lower the number of candidate variants. RESULTS: The hit rates in solo, trio and trio plus testing were 39, 40, and 41%, respectively. The total number of candidate variants in the sibship testing strategy was 117 variants compared to 59 variants in the trio-based analysis. We noticed that the average number of candidate variants in trio-based analysis was 1192 coding variants and 26,454 noncoding variants, and this number was lowered by 50-75% after adding additional family members, with up to two coding and 66 noncoding homozygous variants only, in families with eight family members. CONCLUSION: There was no difference in the hit rate between solo and extended family testing. Trio-based analysis was a better approach than sibship testing, even in a consanguineous population. Finally, each additional family member helped to narrow down the number of variants by 50-75%. Our findings could help clinicians, researchers and testing laboratories select the most cost-effective and appropriate sequencing approach for their patients. Furthermore, using extended family analysis is a very useful tool for complex cases with novel genes.


Subject(s)
Consanguinity , Exome , Family , Genetic Markers , Genetic Predisposition to Disease , Genetic Testing , Genetic Variation , Adult , Child , Female , Humans , Male , Retrospective Studies , Exome Sequencing
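
The narrowing effect reported in the abstract above can be illustrated with a small worked calculation. This is a rough sketch, not the authors' analysis: it assumes the reported 50-75% reduction compounds independently with each family member added beyond the trio, and the function name `remaining_range` is hypothetical. The starting counts (1192 coding, 26,454 noncoding) are the trio averages quoted in the abstract.

```python
def remaining_range(start, added_members, low_cut=0.50, high_cut=0.75):
    """Return (fewest, most) candidate variants expected to remain.

    Assumption (not stated as a model in the abstract): every added
    family member removes between low_cut and high_cut of the
    candidates that are still left.
    """
    most = start * (1 - low_cut) ** added_members
    fewest = start * (1 - high_cut) ** added_members
    return fewest, most


# Trio averages quoted in the abstract.
trio_counts = {"coding": 1192, "noncoding": 26454}

# Eight family members means five people added to a trio.
added = 8 - 3

for label, start in trio_counts.items():
    fewest, most = remaining_range(start, added)
    print(f"{label}: between {fewest:.1f} and {most:.1f} candidates remain")
```

Under this simplified model the coding candidates fall to a few dozen at most. The abstract's figures for eight-member families (up to two coding and 66 noncoding) refer to homozygous variants only, so they are not directly comparable to these totals.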
5.
G3 (Bethesda) ; 10(4): 1193-1196, 2020 04 09.
Article in English | MEDLINE | ID: mdl-32041730

ABSTRACT

We propose LongQC as an easy and automated quality control tool for genomic datasets generated by third-generation sequencing (TGS) technologies such as Oxford Nanopore Technologies (ONT) and SMRT sequencing from Pacific Biosciences (PacBio). Key statistics were optimized for long-read data, and LongQC covers all major TGS platforms. LongQC processes and visualizes those statistics automatically and quickly.


Subject(s)
High-Throughput Nucleotide Sequencing , Nanopores , Quality Control , Sequence Analysis, DNA
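
The abstract above does not list which statistics LongQC computes, so the following is only a generic illustration of the kind of summary a long-read quality control step typically produces (read count, total bases, mean length, N50). The function names `n50` and `summarize` are hypothetical and are not part of LongQC.

```python
def n50(lengths):
    """Return the read length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def summarize(lengths):
    """Collect basic read-length statistics into a dictionary."""
    count = len(lengths)
    bases = sum(lengths)
    return {
        "reads": count,
        "bases": bases,
        "mean_length": bases / count if count else 0.0,
        "n50": n50(lengths),
    }


if __name__ == "__main__":
    # Made-up read lengths, for illustration only.
    print(summarize([2000, 3000, 4000, 5000, 6000]))
```

N50 is used here because a plain mean is easily dominated by many short reads, which is a common situation in long-read data.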
6.
Sci Rep ; 9(1): 12603, 2019 08 30.
Article in English | MEDLINE | ID: mdl-31471543

ABSTRACT

Proteins often work as oligomers or multimers in vivo. Therefore, elucidating their oligomeric or multimeric form (quaternary structure) is crucially important to ascertain their function. X-ray crystal structures of numerous proteins have been accumulated, providing information related to their biological units. Extracting information of biological units from protein crystal structures represents a meaningful task for modern biology. Nevertheless, although many methods have been proposed for identifying biological units appearing in protein crystal structures, it is difficult to distinguish biological protein-protein interfaces from crystallographic ones. Therefore, our simple but highly accurate classifier was developed to infer biological units in protein crystal structures using large amounts of protein sequence information and a modern contact prediction method to exploit covariation signals (CSs) in proteins. We demonstrate that our proposed method is promising even for weak signals of biological interfaces. We also discuss the relation between classification accuracy and conservation of biological units, and illustrate how the selection of sequences included in multiple sequence alignments as sources for obtaining CSs affects the results. With increased amounts of sequence data, the proposed method is expected to become increasingly useful.


Subject(s)
Protein Conformation , Protein Interaction Domains and Motifs , Protein Structure, Quaternary , Proteins/chemistry , Algorithms , Amino Acid Sequence/genetics , Computational Biology , Crystallography, X-Ray , Databases, Protein , Protein Binding/genetics , Protein Multimerization/genetics , Proteins/classification , Proteins/genetics , Sequence Alignment
7.
Proteins ; 86 Suppl 1: 274-282, 2018 03.
Article in English | MEDLINE | ID: mdl-29178285

ABSTRACT

Proteins often exist in multimeric forms when they function as so-called biological assemblies consisting of a specific number and arrangement of protein subunits. Consequently, elucidating biological assemblies is necessary to improve understanding of protein function. Template-Based Modeling (TBM), based on known protein structures, has been used widely for protein structure prediction. TBM has become an increasingly useful approach in recent years because of the increased amounts of information related to protein amino acid sequences and three-dimensional structures. An apparently similar situation exists for biological assembly structure prediction as protein complex structures in the PDB increase, although the inference of biological assemblies is not a trivial task. Many methods using TBM, including ours, have been developed for protein structure prediction. Using enhanced profile-profile alignments, we participated in the 12th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP12), as the FONT team (Group #480). Herein, we present experimental procedures and results of retrospective analyses using our approach for the Quaternary Structure Prediction category of CASP12. We performed profile-profile alignments of several types, based on FORTE, our profile-profile alignment algorithm, to identify suitable templates. Results show that these alignment results enable us to find templates in almost all possible cases. Moreover, we have come to understand the necessity of developing a model selection method that provides improved accuracy. Results also demonstrate that, to some extent, finding templates of protein complexes is useful even for MEDIUM and HARD assembly prediction.


Subject(s)
Computational Biology/methods , Databases, Protein , Models, Molecular , Protein Structure, Quaternary , Proteins/chemistry , Sequence Alignment/methods , Algorithms , Humans , Protein Subunits , Sequence Analysis, Protein
8.
Acta Crystallogr D Struct Biol ; 73(Pt 9): 757-766, 2017 Sep 01.
Article in English | MEDLINE | ID: mdl-28876239

ABSTRACT

An alternative rational approach to improve protein crystals by using single-site mutation of surface residues is proposed based on the results of a statistical analysis using a compiled data set of 918 independent crystal structures, thereby reflecting not only the entropic effect but also other effects upon protein crystallization. This analysis reveals a clear difference in the crystal-packing propensity of amino acids depending on the secondary-structural class. To verify this result, a systematic crystallization experiment was performed with the biotin carboxyl carrier protein from Pyrococcus horikoshii OT3 (PhBCCP). Six single-site mutations were examined: Ala138 on the surface of a β-sheet was mutated to Ile, Tyr, Arg, Gln, Val and Lys. In agreement with prediction, it was observed that the two mutants (A138I and A138Y) harbouring the residues with the highest crystal-packing propensities for β-sheet at position 138 provided better crystallization scores relative to those of other constructs, including the wild type, and that the crystal-packing propensity for β-sheet provided the best correlation with the ratio of obtaining crystals. Two new crystal forms of these mutants were obtained that diffracted to high resolution, generating novel packing interfaces with the mutated residues (Ile/Tyr). The mutations introduced did not affect the overall structures, indicating that a β-sheet can accommodate a successful mutation if it is carefully selected so as to avoid intramolecular steric hindrance. A significant negative correlation between the ratio of obtaining amorphous precipitate and the crystal-packing propensity was also found.


Subject(s)
Acetyl-CoA Carboxylase/chemistry , Archaeal Proteins/chemistry , Pyrococcus horikoshii/chemistry , Acetyl-CoA Carboxylase/genetics , Amino Acids/chemistry , Amino Acids/genetics , Archaeal Proteins/genetics , Crystallography, X-Ray , Fatty Acid Synthase, Type II/chemistry , Fatty Acid Synthase, Type II/genetics , Models, Molecular , Mutagenesis, Site-Directed , Protein Conformation , Protein Structure, Secondary , Pyrococcus horikoshii/genetics
9.
Mol Biol Evol ; 34(7): 1574-1586, 2017 07 01.
Article in English | MEDLINE | ID: mdl-28369657

ABSTRACT

Protein transport systems are fundamentally important for maintaining mitochondrial function. Nevertheless, mitochondrial protein translocases such as the kinetoplastid ATOM complex have recently been shown to vary in eukaryotic lineages. Various evolutionary hypotheses have been formulated to explain this diversity. To resolve any contradiction, estimating the primitive state and clarifying changes from that state are necessary. Here, we present more likely primitive models of mitochondrial translocases, specifically the translocase of the outer membrane (TOM) and translocase of the inner membrane (TIM) complexes, using scrutinized phylogenetic profiles. We then analyzed the translocases' evolution in eukaryotic lineages. Based on those results, we propose a novel evolutionary scenario for diversification of the mitochondrial transport system. Our results indicate that presequence transport machinery was mostly established in the last eukaryotic common ancestor, and that primitive translocases already had a pathway for transporting presequence-containing proteins. Moreover, secondary changes including convergent and migrational gains of a presequence receptor in TOM and TIM complexes, respectively, likely resulted from constrained evolution. The nature of a targeting signal can constrain alteration to the protein transport complex.


Subject(s)
Carrier Proteins/genetics , Mitochondria/genetics , Mitochondrial Membrane Transport Proteins/genetics , Biological Evolution , Biological Transport , Carrier Proteins/metabolism , Eukaryota/genetics , Eukaryota/metabolism , Eukaryotic Cells/metabolism , Evolution, Molecular , Membrane Transport Proteins/genetics , Membrane Transport Proteins/metabolism , Mitochondria/metabolism , Mitochondrial Membrane Transport Proteins/metabolism , Mitochondrial Precursor Protein Import Complex Proteins , Mitochondrial Proteins/metabolism , Phylogeny , Protein Transport/genetics , Sequence Analysis, Protein/methods
10.
Mol Biochem Parasitol ; 209(1-2): 10-17, 2016.
Article in English | MEDLINE | ID: mdl-26792249

ABSTRACT

Entamoeba histolytica, an anaerobic intestinal parasite causing dysentery and extra-intestinal abscesses in humans, possesses highly reduced and divergent mitochondrion-related organelles (MROs) called mitosomes. This organelle lacks many features associated with canonical aerobic mitochondria and even other MROs such as hydrogenosomes. The Entamoeba mitosome has been found to have a compartmentalized sulfate activation pathway, which was recently implicated to have a role in amebic stage conversion. It also features a unique shuttle system via Tom60, which delivers proteins from the cytosol to the mitosome. In addition, only Entamoeba mitosomes possess a novel subclass of β-barrel outer membrane protein called MBOMP30. With the discoveries of such unique features of mitosomes of Entamoeba, there still remain a number of significant unanswered issues pertaining to this organelle. Particularly, the present understanding of the inner mitosomal membrane of Entamoeba is extremely limited. So far, only a few homologs for transporters of various substrates have been confirmed, while the components of the protein translocation complexes appear to be absent or are yet to be discovered. Employing a similar strategy as in our previous work, we collaborated to screen and discover mitosomal membrane proteins. Using a specialized prediction pipeline, we searched for proteins possessing α-helical transmembrane domains, which are unique to E. histolytica mitosomes. From the prediction algorithm, 25 proteins emerged as candidates, two of which were initially observed to be localized to the mitosomes. Further screening and analysis of the predicted proteins may provide clues to answer key questions on mitosomal evolution, biogenesis, dynamics, and biochemical processes.


Subject(s)
Entamoeba histolytica/metabolism , Membrane Proteins/metabolism , Mitochondrial Membranes/metabolism , Protozoan Proteins/metabolism , Amino Acid Sequence , Biological Evolution , Datasets as Topic , Humans , Membrane Proteins/chemistry , Mitochondria/metabolism , Mitochondrial Membranes/chemistry , Protein Transport , Protozoan Proteins/chemistry
11.
Science ; 349(6255): 1544-8, 2015 Sep 25.
Article in English | MEDLINE | ID: mdl-26404837

ABSTRACT

Mitochondria fulfill central functions in cellular energetics, metabolism, and signaling. The outer membrane translocator complex (the TOM complex) imports most mitochondrial proteins, but its architecture is unknown. Using a cross-linking approach, we mapped the active translocator down to single amino acid residues, revealing different transport paths for preproteins through the Tom40 channel. An N-terminal segment of Tom40 passes from the cytosol through the channel to recruit chaperones from the intermembrane space that guide the transfer of hydrophobic preproteins. The translocator contains three Tom40 β-barrel channels sandwiched between a central α-helical Tom22 receptor cluster and external regulatory Tom proteins. The preprotein-translocating trimeric complex exchanges with a dimeric isoform to assemble new TOM complexes. Dynamic coupling of α-helical receptors, β-barrel channels, and chaperones generates a versatile machinery that transports about 1000 different proteins.


Subject(s)
Mitochondrial Membrane Transport Proteins/chemistry , Saccharomyces cerevisiae Proteins/chemistry , Amino Acid Sequence , Cytosol/metabolism , Mitochondrial Membrane Transport Proteins/metabolism , Molecular Chaperones , Molecular Sequence Data , Protein Multimerization , Protein Structure, Secondary , Protein Transport , Saccharomyces cerevisiae Proteins/metabolism
12.
Mol Cell Proteomics ; 14(4): 1113-26, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25670805

ABSTRACT

Mitochondria provide numerous essential functions for cells and their dysfunction leads to a variety of diseases. Thus, obtaining a complete mitochondrial proteome should be a crucial step toward understanding the roles of mitochondria. Many mitochondrial proteins have been identified experimentally but a complete list is not yet available. To fill this gap, methods to computationally predict mitochondrial proteins from amino acid sequence have been developed and are widely used, but unfortunately, their accuracy is far from perfect. Here we describe MitoFates, an improved prediction method for cleavable N-terminal mitochondrial targeting signals (presequences) and their cleavage sites. MitoFates introduces novel sequence features including positively charged amphiphilicity, presequence motifs, and position weight matrices modeling the presequence cleavage sites. These features are combined with classical ones such as amino acid composition and physico-chemical properties as input to a standard support vector machine classifier. On independent test data, MitoFates attains better performance than existing predictors in both detection of presequences and in predicting their cleavage sites. We used MitoFates to look for undiscovered mitochondrial proteins among 42,217 human proteins (including isoforms such as alternative splicing or translation initiation variants). MitoFates predicts 1167 genes to have at least one isoform with a presequence. Five hundred and eighty of these genes were not annotated as mitochondrial in either UniProt or Gene Ontology. Interestingly, these include candidate regulators of parkin translocation to damaged mitochondria, and also many genes with known disease mutations, suggesting that careful investigation of MitoFates predictions may be helpful in elucidating the role of mitochondria in health and disease. MitoFates is open source with a convenient web server publicly available.


Subject(s)
Computational Biology/methods , Mitochondria/metabolism , Protein Sorting Signals , Amino Acid Motifs , Amino Acid Sequence , Area Under Curve , Cluster Analysis , Databases, Protein , Disease , Humans , Hydrophobic and Hydrophilic Interactions , Internet , Mitochondrial Membranes/metabolism , Mitochondrial Proteins/metabolism , Molecular Sequence Data , Protein Isoforms/metabolism , Proteome , ROC Curve , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/metabolism
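
One of the features named in the abstract above is a position weight matrix (PWM) that models the presequence cleavage site. The sketch below shows how a PWM can be built and scanned in general; it is not MitoFates' implementation. The four training sites, the test sequence, the uniform background, and the function names `build_pwm` and `best_window` are all invented for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pwm(sites, pseudocount=1.0):
    """Build a log-odds PWM from equal-length aligned sites.

    Assumes a uniform background over the 20 amino acids, which is a
    simplification chosen for this sketch.
    """
    width = len(sites[0])
    background = 1.0 / len(AMINO_ACIDS)
    total = len(sites) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for pos in range(width):
        counts = Counter(site[pos] for site in sites)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pwm.append(column)
    return pwm


def best_window(sequence, pwm):
    """Return (start, score) of the highest-scoring window in sequence."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        score = sum(pwm[i][sequence[start + i]] for i in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best


if __name__ == "__main__":
    # Hypothetical aligned sites around a cleavage position (toy data).
    toy_sites = ["RAFS", "RGYS", "RSFA", "RAYS"]
    pwm = build_pwm(toy_sites)
    # Made-up N-terminal sequence used only to exercise the scan.
    print(best_window("MLSLRQSIRFFKRAFSTGE", pwm))
```

In a predictor of the kind the abstract describes, a window score like this would be one input among several to the classifier rather than a decision on its own.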
13.
BMC Genomics ; 15: 46, 2014 Jan 20.
Article in English | MEDLINE | ID: mdl-24438075

ABSTRACT

BACKGROUND: Protein subcellular localization is a central problem in understanding cell biology and has been the focus of intense research. In order to predict localization from amino acid sequence, a myriad of features have been tried, including amino acid composition, sequence similarity, the presence of certain motifs or domains, and many others. Surprisingly, sequence conservation of sorting motifs has not yet been employed, despite its extensive use for tasks such as the prediction of transcription factor binding sites. RESULTS: Here, we flip the problem around, and present a proof of concept for the idea that the lack of sequence conservation can be a novel feature for localization prediction. We show that for yeast, mammal and plant datasets, evolutionary sequence divergence alone has significant power to identify sequences with N-terminal sorting sequences. Moreover, sequence divergence is nearly as effective when computed on automatically defined ortholog sets as on hand-curated ones. Unfortunately, sequence divergence did not necessarily increase classification performance when combined with some traditional sequence features such as amino acid composition. However, a post-hoc analysis of the proteins in which sequence divergence changes the prediction yielded some proteins with atypical (i.e. not MPP-cleaved) matrix targeting signals as well as a few misannotations. CONCLUSION: We report the results of the first quantitative study of the effectiveness of evolutionary sequence divergence as a feature for protein subcellular localization prediction. We show that divergence is indeed useful for prediction, but it is not trivial to improve overall accuracy simply by adding this feature to classical sequence features. Nevertheless, we argue that sequence divergence is a promising feature and show anecdotal examples in which it succeeds where other features fail.


Subject(s)
Genetic Variation , Plants/genetics , Protein Sorting Signals/genetics , Proteins/genetics , Saccharomyces cerevisiae/genetics , Algorithms , Amino Acid Sequence , Animals , Evolution, Molecular , Humans , Mitochondrial Proteins/genetics , Mitochondrial Proteins/metabolism , Molecular Sequence Data , Phylogeny , Plants/classification , Plants/metabolism , Proteins/chemistry , Proteins/metabolism , Saccharomyces cerevisiae/classification , Saccharomyces cerevisiae/metabolism , Sequence Alignment
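
The idea in the abstract above, that an N-terminal sorting sequence tends to be less conserved than the rest of the protein, can be sketched with a simple mismatch fraction over aligned orthologs. The abstract does not specify the divergence measure used in the study, so the measure here is an assumption, and the three aligned sequences and the function name `divergence` are invented for illustration.

```python
from itertools import combinations


def divergence(aligned, start, end):
    """Mean fraction of mismatching columns in [start, end), averaged
    over all sequence pairs; columns with a gap in either sequence
    are skipped."""
    fractions = []
    for a, b in combinations(aligned, 2):
        columns = [
            (x, y)
            for x, y in zip(a[start:end], b[start:end])
            if x != "-" and y != "-"
        ]
        if columns:
            mismatches = sum(x != y for x, y in columns)
            fractions.append(mismatches / len(columns))
    return sum(fractions) / len(fractions) if fractions else 0.0


if __name__ == "__main__":
    # Toy alignment of three hypothetical orthologs (12 columns each).
    toy_alignment = [
        "MLSRAAGKVLDE",
        "MFTRLSGKVLDE",
        "MLARSTGKVIDE",
    ]
    n_terminal = divergence(toy_alignment, 0, 6)
    remainder = divergence(toy_alignment, 6, 12)
    print(f"N-terminal divergence: {n_terminal:.3f}")
    print(f"Remainder divergence:  {remainder:.3f}")
```

A feature built this way would be the difference or ratio between the two values; how large a gap indicates a sorting signal would have to be learned from labeled data.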