Results 1 - 6 of 6
1.
Orphanet J Rare Dis ; 15(1): 191, 2020 07 22.
Article in English | MEDLINE | ID: mdl-32698834

ABSTRACT

BACKGROUND: In the diagnosis of rare genetic diseases, we face a decision as to the degree to which the sequencing lab offers one or more diagnoses based on clinical input provided by the clinician, or the clinician reaches a diagnosis based on the complete set of variants provided by the lab. We tested a software approach to assist the clinician in making the diagnosis based on clinical findings and an annotated genomic variant table, using cases already solved by less automated processes. RESULTS: For the 81 cases studied (involving 216 individuals), 70 had genetic abnormalities with phenotypes previously described in the literature, and 11 were not described in the literature at the time of analysis ("discovery genes"). These included cases beyond a trio, including ones with different variants in the same gene. In 100% of cases the abnormality was recognized. Of the 70, the abnormality was ranked #1 in 94% of cases, with an average rank of 1.1 for all cases. Large CNVs could be analyzed in an integrated analysis, performed in 24 of the cases. The process is rapid enough to allow for periodic reanalysis of unsolved cases. CONCLUSIONS: A clinician-friendly environment for clinical correlation can be provided to clinicians who are best positioned to have the clinical information needed for this interpretation.


Subject(s)
Rare Diseases , Software , DNA Copy Number Variations , Genomics , Humans , Phenotype , Rare Diseases/diagnosis , Rare Diseases/genetics
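The two ranking figures quoted in the abstract above (share of cases in which the causal abnormality was ranked #1, and the average rank) are simple summaries of a list of per-case ranks. The sketch below shows how such a summary could be computed. The function name and the example ranks are hypothetical and are not taken from the study.

```python
def summarize_ranks(ranks):
    """Return (fraction of cases ranked #1, mean rank) for 1-based ranks."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    top1_fraction = sum(1 for r in ranks if r == 1) / len(ranks)
    mean_rank = sum(ranks) / len(ranks)
    return top1_fraction, mean_rank


# Hypothetical ranks for ten solved cases (illustrative only, not study data).
example_ranks = [1, 1, 1, 1, 2, 1, 1, 1, 1, 1]
top1_fraction, mean_rank = summarize_ranks(example_ranks)
print(f"ranked #1: {top1_fraction:.0%}, mean rank: {mean_rank:.1f}")
```

Reporting both numbers is useful because the mean rank shows how far down the list a clinician has to look in the cases where the correct finding is not at the top.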
2.
PLoS One ; 11(7): e0155839, 2016.
Article in English | MEDLINE | ID: mdl-27434306

ABSTRACT

Micromonas is a unicellular motile alga within the Prasinophyceae, a green algal group that is related to land plants. This picoeukaryote (<2 µm diameter) is widespread in the marine environment but is not well understood at the cellular level. Here, we examine shifts in mRNA and protein expression over the course of the day-night cycle using triplicated mid-exponential, nutrient replete cultures of Micromonas pusilla CCMP1545. Samples were collected at key transition points during the diel cycle for evaluation using high-throughput LC-MS proteomics. In conjunction, matched mRNA samples from the same time points were sequenced using paired-end directional Illumina RNA-Seq to investigate the dynamics and relationship between the mRNA and protein expression programs of M. pusilla. Similar to a prior study of the marine cyanobacterium Prochlorococcus, we found significant divergence in the mRNA and proteomics expression dynamics in response to the light:dark cycle. Additionally, expression responses of genes and the proteins they encode could also vary within the same metabolic pathway, as we observed in the oxygenic photosynthesis pathway. A regression framework was used to predict protein levels from both mRNA expression and gene-specific sequence-based features. Several features in the genome sequence were found to influence protein abundance including codon usage as well as 3' UTR length and structure. Collectively, our studies provide insights into the regulation of the proteome over a diel cycle as well as the relationships between transcriptional and translational programs in the widespread marine green alga Micromonas.


Subject(s)
Algal Proteins/genetics , Chlorophyta/genetics , Gene Expression Regulation, Plant , Proteomics , RNA, Algal/genetics , RNA, Messenger/genetics , 3' Untranslated Regions , Algal Proteins/metabolism , Chlorophyta/metabolism , Codon , Gene Ontology , Molecular Sequence Annotation , Photoperiod , Photosynthesis/genetics , Protein Biosynthesis , RNA, Algal/metabolism , RNA, Messenger/metabolism , Sequence Analysis, RNA , Transcription, Genetic
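The abstract above mentions a regression framework that predicts protein levels from mRNA expression together with sequence-based features. The abstract does not say which model was used, so the following is only a minimal sketch that assumes ordinary least squares with two predictors. The function names, the choice of predictors, and all numeric values are hypothetical.

```python
def fit_ols(rows, targets):
    """Fit y ~ b0 + b1*x1 + ... by solving the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(p)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(coefs, row):
    """Apply fitted coefficients (intercept first) to one feature row."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


# Made-up data: (mRNA level, scaled 3' UTR length) -> protein level.
features = [(1.0, 10.0), (2.0, 30.0), (3.0, 20.0), (4.0, 50.0), (5.0, 40.0)]
protein = [1.0 + 2.0 * m - 0.5 * u for m, u in features]
coefs = fit_ols(features, protein)
print([round(c, 3) for c in coefs])
```

With real data the fit would not be exact, and the relative size of the fitted coefficients is one way to ask how much a sequence feature adds beyond mRNA level alone.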
3.
PLoS Comput Biol ; 7(12): e1002228, 2011 Dec.
Article in English | MEDLINE | ID: mdl-22144874

ABSTRACT

The increasing abundance of large-scale, high-throughput datasets for many closely related organisms provides opportunities for comparative analysis via the simultaneous biclustering of datasets from multiple species. These analyses require a reformulation of how to organize multi-species datasets and visualize the results of comparative genomics data analyses. Recently, we developed a method, multi-species cMonkey, which integrates heterogeneous high-throughput datatypes from multiple species to identify conserved regulatory modules. Here we present an integrated data visualization system, built upon the Gaggle, enabling exploration of our method's results (available at http://meatwad.bio.nyu.edu/cmmr.html). The system can also be used to explore other comparative genomics datasets and outputs from other data analysis procedures, such as results from other multiple-species clustering programs or from independent clustering of different single-species datasets. We provide an example use of our system for two bacteria, Escherichia coli and Salmonella Typhimurium. We illustrate the use of our system by exploring conserved biclusters involved in nitrogen metabolism, uncovering a putative function for yjjI, a currently uncharacterized gene that we predict to be involved in nitrogen assimilation.


Subject(s)
Algorithms , Computational Biology/methods , Databases, Factual , Genome, Bacterial , Software , Cluster Analysis , Escherichia coli/genetics , Escherichia coli/metabolism , Escherichia coli/physiology , Nitrogen/metabolism , Salmonella typhimurium/genetics , Salmonella typhimurium/metabolism , Salmonella typhimurium/physiology , Systems Biology , User-Computer Interface
4.
Genome Biol ; 11(9): R96, 2010.
Article in English | MEDLINE | ID: mdl-20920250

ABSTRACT

We describe an algorithm, multi-species cMonkey, for the simultaneous biclustering of heterogeneous multiple-species data collections and apply the algorithm to a group of bacteria containing Bacillus subtilis, Bacillus anthracis, and Listeria monocytogenes. The algorithm reveals evolutionary insights into the surprisingly high degree of conservation of regulatory modules across these three species and allows data and insights from well-studied organisms to complement the analysis of related but less well studied organisms.


Subject(s)
Algorithms , Bacillus/genetics , Cluster Analysis , Data Mining , Genomics , Listeria monocytogenes/genetics , Multigene Family , Bacillus anthracis/genetics , Bacillus subtilis/genetics , Base Sequence , Computational Biology/methods , Gene Expression Profiling/methods , Gene Expression Regulation, Bacterial , Gene Regulatory Networks , Genome, Bacterial , Models, Genetic , Pattern Recognition, Automated
5.
Proteins ; 66(1): 127-35, 2007 Jan 01.
Article in English | MEDLINE | ID: mdl-17039548

ABSTRACT

Fibrous proteins such as collagen, silk, and elastin play critical biological roles, yet they have been the subject of few projects that use computational techniques to predict either their class or their structure. In this article, we present FiberID, a simple yet effective method for identifying and distinguishing three fibrous protein subclasses from their primary sequences. Using a combination of amino acid composition and fast Fourier measurements, FiberID can classify fibrous proteins belonging to these subclasses with high accuracy by using two standard machine learning techniques (decision trees and Naïve Bayesian classifiers). After presenting our results, we present several fibrous sequences that are regularly misclassified by FiberID as sequences of potential interest for further study. Finally, we analyze the decision trees developed by FiberID for potential insights regarding the structure of these proteins.


Subject(s)
Algorithms , Collagen/classification , Computational Biology/methods , Elastin/classification , Silk/classification , Software , Amino Acid Sequence , Animals , Bayes Theorem , Collagen/chemistry , Databases, Protein , Elastin/chemistry , Humans , Molecular Sequence Data , Silk/chemistry
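The abstract above describes features built from amino acid composition and Fourier measurements of the primary sequence. The sketch below illustrates the two kinds of feature. It is not FiberID's actual feature set: the function names are hypothetical, and instead of a full fast Fourier transform it evaluates a single Fourier component of a residue indicator signal at one chosen period. The example relies on the well-known Gly-X-Y repeat of collagen, where glycine recurs every third residue.

```python
import cmath
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Fraction of each of the 20 standard amino acids in seq."""
    counts = Counter(seq)
    return {aa: counts.get(aa, 0) / len(seq) for aa in AMINO_ACIDS}


def periodicity_power(seq, residue, period):
    """Normalized Fourier power of the 0/1 indicator for `residue` at `period`.

    Returns a value in [0, 1]; it is large when the residue recurs every
    `period` positions and near zero when it shows no such periodicity.
    """
    n = len(seq)
    component = sum(
        cmath.exp(-2j * cmath.pi * i / period)
        for i, aa in enumerate(seq)
        if aa == residue
    )
    return abs(component) ** 2 / n ** 2


# Idealized collagen-like repeat (illustrative sequence, not a real protein).
toy = "GPPGPPGPP"
print(round(composition(toy)["G"], 3))
print(round(periodicity_power(toy, "G", 3), 3))
```

Feature vectors of this kind could then be passed to a decision tree or naive Bayes classifier, which are the two learners the abstract names.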
6.
Pharmacogenomics ; 7(3): 503-9, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16610960

ABSTRACT

Comprehensive, systematic and integrated data-centric statistical approaches to disease modeling can provide powerful frameworks for understanding disease etiology. Here, one such computational framework based on redescription mining in both its incarnations, static and dynamic, is discussed. The static framework provides bioinformatic tools applicable to multifaceted datasets, containing genetic, transcriptomic, proteomic, and clinical data for diseased patients and normal subjects. The dynamic redescription framework provides systems biology tools to model complex sets of regulatory, metabolic and signaling pathways in the initiation and progression of a disease. As an example, the case of chronic fatigue syndrome (CFS) is considered, which has so far remained intractable and unpredictable in its etiology and nosology. The redescription mining approaches can be applied to the Centers for Disease Control and Prevention's Wichita (KS, USA) dataset, integrating transcriptomic, epidemiological and clinical data, and can also be used to study how pathways in the hypothalamic-pituitary-adrenal axis affect CFS patients.


Subject(s)
Data Interpretation, Statistical , Databases, Factual , Algorithms , Fatigue Syndrome, Chronic/epidemiology , Fatigue Syndrome, Chronic/genetics , Humans , Models, Statistical