1.
Hum Genet ; 140(12): 1651-1661, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34047840

ABSTRACT

Puberty is a complex developmental process that varies considerably among individuals and populations. Genetic factors explain a large proportion of the variability of several pubertal traits. Recent genome-wide association studies (GWAS) have identified hundreds of variants involved in traits that result from body growth, like adult height. However, they do not capture many genetic loci involved in growth changes over distinct growth phases. Further, such GWAS have been mostly performed in Europeans, but it is unknown how these findings relate to other continental populations. In this study, we analyzed the genetic basis of three pubertal traits: peak height velocity (PV), age at PV (APV) and height at APV (HAPV). We analyzed a cohort of 904 admixed Chilean children and adolescents with European and Mapuche Native American ancestries. Height was measured on roughly a [Formula: see text]month basis from childhood to adolescence between 2006 and 2019. We predict that, on average, HAPV is 4.3 cm higher in European than in Mapuche adolescents (P = 0.042), and APV is 0.73 years later in European compared with Mapuche adolescents (P = 0.023). Further, by performing a GWAS on 774,433 single-nucleotide polymorphisms, we identified a genetic signal harboring 3 linked variants significantly associated with PV in boys (P [Formula: see text]). This signal has never been associated with growth-related traits.


Subject(s)
Indians, South American/genetics , Puberty/genetics , Adolescent , Adolescent Development , Adult , Aging/genetics , Body Height/genetics , Chile , Cohort Studies , Female , Genetic Variation , Genome-Wide Association Study , Humans , Male , White People/genetics
3.
Bioinformatics ; 29(20): 2649-50, 2013 Oct 15.
Article in English | MEDLINE | ID: mdl-23929030

ABSTRACT

SUMMARY: The understanding of the biological role of RNA molecules has changed. Although it is widely accepted that RNAs play important regulatory roles without necessarily coding for proteins, the functions of many of these non-coding RNAs are unknown. Thus, determining or modeling the 3D structure of RNA molecules as well as assessing their accuracy and stability has become of great importance for characterizing their functional activity. Here, we introduce a new web application, WebRASP, that uses knowledge-based potentials for scoring RNA structures based on distance-dependent pairwise atomic interactions. This web server allows the users to upload a structure in PDB format, select several options to visualize the structure and calculate the energy profile. The server contains online help, tutorials and links to other related resources. We believe this server will be a useful tool for predicting and assessing the quality of RNA 3D structures. AVAILABILITY AND IMPLEMENTATION: The web server is available at http://melolab.org/webrasp. It has been tested on the most popular web browsers and requires Java plugin for Jmol visualization.


Subject(s)
Nucleic Acid Conformation , RNA Stability , RNA/chemistry , Internet , Models, Molecular , Software
4.
iScience ; 26(2): 106091, 2023 Feb 17.
Article in English | MEDLINE | ID: mdl-36844456

ABSTRACT

Body-mass index (BMI) is a hallmark of adiposity. In contrast with adulthood, the genetic architecture of BMI during childhood is poorly understood. The few genome-wide association studies (GWAS) on children have been performed almost exclusively in Europeans and at single ages. We performed cross-sectional and longitudinal GWAS for BMI-related traits on 904 admixed children with mostly Mapuche Native American and European ancestries. We found regulatory variants of the immune gene HLA-DQB3 strongly associated with BMI at 1.5-2.5 years old. A variant in the sex-determining gene DMRT1 was associated with the age at adiposity rebound (Age-AR) in girls (P = 9.8 × 10⁻⁹). BMI was significantly higher in Mapuche than in Europeans between 5.5 and 16.5 years old. Finally, Age-AR was significantly lower (P = 0.004) by 1.94 years and BMI at AR was significantly higher (P = 0.04) by 1.2 kg/m², in Mapuche children compared with Europeans.

5.
Bioinformatics ; 27(8): 1086-93, 2011 Apr 15.
Article in English | MEDLINE | ID: mdl-21349865

ABSTRACT

MOTIVATION: Over recent years, the view that RNA simply serves as an information transfer molecule has dramatically changed. The study of the sequence/structure/function relationships in RNA is becoming more important. As a direct consequence, the total number of experimentally solved RNA structures has dramatically increased and new computer tools for predicting RNA structure from sequence are rapidly emerging. Therefore, new and accurate methods for assessing the accuracy of RNA structure models are clearly needed. RESULTS: Here, we introduce an all-atom knowledge-based potential for the assessment of RNA three-dimensional (3D) structures. We have benchmarked our new potential, called Ribonucleic Acids Statistical Potential (RASP), with two different decoy datasets composed of near-native RNA structures. In one of the benchmark sets, RASP was able to rank the closest model to the X-ray structure as the best and within the top 10 models for ∼93 and ∼95% of decoys, respectively. The average correlation coefficient between model accuracy, calculated as the root mean square deviation and global distance test-total score (GDT-TS) measures of C3' atoms, and the RASP score was 0.85 and 0.89, respectively. Based on a recently released benchmark dataset that contains hundreds of 3D models for 32 RNA motifs with non-canonical base pairs, the RASP scoring function compared favorably to the ROSETTA FARFAR force field in the selection of accurate models. Finally, using the self-splicing group I intron and the stem-loop IIIc from hepatitis C virus internal ribosome entry site as test cases, we show that RASP is able to discriminate between known structure-destabilizing mutations and compensatory mutations. AVAILABILITY: RASP can be readily applied to assess all-atom or coarse-grained RNA structures and thus should be of interest to both developers and end-users of RNA structure prediction methods. The computer software and knowledge-based potentials are freely available at http://melolab.org/supmat.html. CONTACT: fmelo@bio.puc.cl; mmarti@cipf.es SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
RNA/chemistry , Software , Data Interpretation, Statistical , Knowledge Bases , Models, Molecular , Nucleic Acid Conformation
6.
J Biomed Biotechnol ; 2012: 103132, 2012.
Article in English | MEDLINE | ID: mdl-22505803

ABSTRACT

Currently, about 20 crystal structures per day are released and deposited in the Protein Data Bank. A significant fraction of these structures is produced by research groups associated with the structural genomics consortium. The biological function of many of these proteins is generally unknown or not validated by experiment. Therefore, a growing need for functional prediction of protein structures has emerged. Here we present an integrated bioinformatics method that combines sequence-based relationships and three-dimensional (3D) structural similarity of transcriptional regulators with computer prediction of their cognate DNA binding sequences. We applied this method to the AraC/XylS family of transcription factors, which is a large family of transcriptional regulators found in many bacteria controlling the expression of genes involved in diverse biological functions. Three putative new members of this family with known 3D structure but unknown function were identified for which a probable functional classification is provided. Our bioinformatics analyses suggest that they could be involved in plant cell wall degradation (Lin2118 protein from Listeria innocua, PDB code 3oou), symbiotic nitrogen fixation (protein from Chromobacterium violaceum, PDB code 3oio), and either metabolism of plant-derived biomass or nitrogen fixation (protein from Rhodopseudomonas palustris, PDB code 3mn2).


Subject(s)
AraC Transcription Factor/classification , Computational Biology/methods , Molecular Sequence Annotation/methods , Transcription Factors/classification , Amino Acid Sequence , AraC Transcription Factor/chemistry , Binding Sites , Cluster Analysis , Databases, Protein , Models, Molecular , Models, Statistical , Molecular Sequence Data , Sequence Alignment , Transcription Factors/chemistry
7.
Front Microbiol ; 13: 967021, 2022.
Article in English | MEDLINE | ID: mdl-36338106

ABSTRACT

High-throughput sequencing (HTS) methods are transforming our capacity to detect pathogens and perform disease diagnosis. Although sequencing advances have enabled accessible and point-of-care HTS, data analysis pipelines have yet to provide robust tools for precise and certain diagnosis, particularly in cases of low sequencing coverage. Lack of standardized metrics and harmonized detection thresholds confounds the problem further, impeding the adoption and implementation of these solutions in real-world applications. In this work, we tackle these issues and propose biologically-informed viral genome assembly coverage as a method to improve diagnostic certainty. We use the identification of viral replicases, an essential function of viral life cycles, to define genome coverage thresholds in which biological functions can be described. We validate the analysis pipeline, Viroscope, using field samples, synthetic and published datasets, and demonstrate that it provides sensitive and specific viral detection. Furthermore, we developed Viroscope.io, a web service that provides on-demand viral diagnosis from HTS data, to facilitate adoption and implementation by phytosanitary agencies and enable precise viral diagnosis.

8.
BMC Bioinformatics ; 11: 262, 2010 May 18.
Article in English | MEDLINE | ID: mdl-20482798

ABSTRACT

The Protein-DNA Interface database (PDIdb) is a repository containing relevant structural information of Protein-DNA complexes solved by X-ray crystallography and available at the Protein Data Bank. The database includes a simple functional classification of the protein-DNA complexes that consists of three hierarchical levels: Class, Type and Subtype. This classification has been defined and manually curated by humans based on the information gathered from several sources that include PDB, PubMed, CATH, SCOP and COPS. The current version of the database contains only structures with resolution of 2.5 Å or higher, accounting for a total of 922 entries. The major aim of this database is to contribute to the understanding of the main rules that underlie the molecular recognition process between DNA and proteins. To this end, the database is focused on each specific atomic interface rather than on the separated binding partners. Therefore, each entry in this database consists of a single and independent protein-DNA interface. We hope that PDIdb will be useful to many researchers working in fields such as the prediction of transcription factor binding sites in DNA, the study of specificity determinants that mediate enzyme recognition events, engineering and design of new DNA binding proteins with distinct binding specificity and affinity, among others. Finally, due to its friendly and easy-to-use web interface, we hope that PDIdb will also serve educational and teaching purposes.


Subject(s)
DNA-Binding Proteins/chemistry , DNA/chemistry , Databases, Nucleic Acid , Databases, Protein , Binding Sites , Crystallography, X-Ray , DNA/classification , DNA-Binding Proteins/classification
9.
J Mol Biol ; 432(7): 2428-2443, 2020 03 27.
Article in English | MEDLINE | ID: mdl-32142788

ABSTRACT

The intricate details of how proteins bind to proteins, DNA, and RNA are crucial for the understanding of almost all biological processes. Disease-causing sequence variants often affect binding residues. Here, we described a new, comprehensive system of in silico methods that take only protein sequence as input to predict binding of protein to DNA, RNA, and other proteins. Firstly, we needed to develop several new methods to predict whether or not proteins bind (per-protein prediction). Secondly, we developed independent methods that predict which residues bind (per-residue). Not requiring three-dimensional information, the system can predict the actual binding residue. The system combined homology-based inference with machine learning and motif-based profile-kernel approaches with word-based (ProtVec) solutions to machine learning protein level predictions. This achieved an overall non-exclusive three-state accuracy of 77% ± 1% (±one standard error) corresponding to a 1.8 fold improvement over random (best classification for protein-protein with F1 = 91 ± 0.8%). Standard neural networks for per-residue binding residue predictions appeared best for DNA-binding (Q2 = 81 ± 0.9%) followed by RNA-binding (Q2 = 80 ± 1%) and worst for protein-protein binding (Q2 = 69 ± 0.8%). The new method, dubbed ProNA2020, is available as code through github (https://github.com/Rostlab/ProNA2020.git) and through PredictProtein (www.predictprotein.org).
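The per-residue and per-protein figures quoted above (Q2, F1) can be illustrated with a minimal sketch. It assumes Q2 means the fraction of residues whose two-state label (binding or non-binding) is predicted correctly, and it uses the standard F1 definition; the function name `q2_and_f1` and the example labels are hypothetical and are not taken from ProNA2020.

```python
def q2_and_f1(observed, predicted):
    """Return (Q2, F1) for two-state labels, where 1 = binding and 0 = non-binding.

    Assumption: Q2 is plain two-state accuracy over all residues.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("label lists must be non-empty and of equal length")

    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # true positives
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # true negatives
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # false positives
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # false negatives

    q2 = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return q2, f1


# Hypothetical labels for a five-residue stretch.
q2, f1 = q2_and_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Because binding residues are usually a minority, Q2 and F1 can diverge sharply, which is one reason to report both.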


Subject(s)
Computational Biology/methods , DNA/metabolism , Neural Networks, Computer , Proteins/metabolism , RNA/metabolism , Sequence Analysis, Protein/methods , Software , Animals , Binding Sites , DNA/chemistry , Eukaryota/metabolism , Humans , Machine Learning , Nucleic Acid Conformation , Prokaryotic Cells/metabolism , Protein Binding , Protein Conformation , Proteins/chemistry , RNA/chemistry
10.
Genome Biol Evol ; 12(8): 1459-1470, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32614437

ABSTRACT

Detection of positive selection signatures in populations around the world is helping to uncover recent human evolutionary history as well as the genetic basis of diseases. Most human evolutionary genomic studies have been performed in European, African, and Asian populations. However, populations with Native American ancestry have been largely underrepresented. Here, we used a genome-wide local ancestry enrichment approach complemented with neutral simulations to identify postadmixture adaptations undergone by admixed Chileans through gene flow from Europeans into local Native Americans. The top significant hits (P = 2.4 × 10⁻⁷) are variants in a region on chromosome 12 comprising multiple regulatory elements. This region includes rs12821256, which regulates the expression of KITLG, a well-known gene involved in lighter hair and skin pigmentation in Europeans as well as in thermogenesis. Another variant from that region is associated with the long noncoding RNA RP11-13A1.1, which has been specifically involved in the innate immune response against infectious pathogens. Our results suggest that these genes were relevant for adaptation in Chileans following the Columbian exchange.


Subject(s)
Adaptation, Biological/genetics , Chromosomes, Human, Pair 12 , Genome, Human , Pigmentation/genetics , Selection, Genetic , Chile , Female , Gene Flow , Haplotypes , Humans , Hybridization, Genetic , Indians, South American/genetics , Male , Thermogenesis/genetics , White People/genetics
11.
Nucleic Acids Res ; 35(Web Server issue): W163-8, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17626053

ABSTRACT

We describe a web server for the accurate mapping of experimental tags in serial analysis of gene expression (SAGE). The core of the server relies on a database of genomic virtual tags built by a recently described method that attempts to reduce the amount of ambiguous assignments for those tags that are not unique in the genome. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimation of their confidence for experimental observation that ranks tags that present multiple matches in the genome. The output of the server consists of a table in HTML format that contains links to a graphic representation of the results and to some external servers and databases, facilitating the tasks of analysis of gene expression and gene discovery. Also, a table in tab delimited text format is produced, allowing the user to export the results into custom databases and software for further analysis. The current server version provides the most accurate and complete SAGE tag mapping source that is available for the yeast organism. In the near future, this server will also allow the accurate mapping of experimental SAGE-tags from other model organisms such as human, mouse, frog and fly. The server is freely available on the web at: http://dna.bio.puc.cl/SAGExplore.html.
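As a rough illustration of what a genomic virtual tag is, the sketch below scans a sequence for an anchoring-enzyme site and records the bases immediately downstream. The CATG site (NlaIII) and the 10-nt tag length are common SAGE conventions assumed here, not details given in the abstract. The function name `virtual_tags` is hypothetical, only one strand is scanned, and the confidence ranking described above is not modelled.

```python
def virtual_tags(sequence, anchor="CATG", tag_length=10):
    """List (site_position, tag) for every anchor site with a full-length tag.

    Assumptions: anchor="CATG" and tag_length=10 follow common SAGE practice;
    the reverse strand would be handled by running this on the reverse complement.
    """
    seq = sequence.upper()
    tags = []
    site = seq.find(anchor)
    while site != -1:
        tag_start = site + len(anchor)
        tag = seq[tag_start:tag_start + tag_length]
        if len(tag) == tag_length:  # skip sites too close to the sequence end
            tags.append((site, tag))
        site = seq.find(anchor, site + 1)
    return tags


# Hypothetical toy sequence with two CATG sites; the second is too close to the end.
tags = virtual_tags("AACATGAAAAACCCCCGGCATGTT")
```

A tag that appears at more than one site in this list is ambiguous, which is the situation the server's confidence estimate is meant to address.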


Subject(s)
Computational Biology/methods , Gene Expression Regulation, Fungal , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Sequence Tagged Sites , Software , Chromosome Mapping , DNA, Complementary/genetics , Databases, Genetic , Expressed Sequence Tags , Internet , RNA, Fungal/genetics , RNA, Messenger/genetics , RNA, Untranslated/genetics
12.
BMC Bioinformatics ; 9: 265, 2008 Jun 05.
Article in English | MEDLINE | ID: mdl-18534022

ABSTRACT

BACKGROUND: As in many different areas of science and technology, most important problems in bioinformatics rely on the proper development and assessment of binary classifiers. A generalized assessment of the performance of binary classifiers is typically carried out through the analysis of their receiver operating characteristic (ROC) curves. The area under the ROC curve (AUC) constitutes a popular indicator of the performance of a binary classifier. However, the assessment of the statistical significance of the difference between any two classifiers based on this measure is not a straightforward task, since not many freely available tools exist. Most existing software is either not free, difficult to use or not easy to automate when a comparative assessment of the performance of many binary classifiers is intended. This constitutes the typical scenario for the optimization of parameters when developing new classifiers and also for their performance validation through the comparison to previous art. RESULTS: In this work we describe and release new software to assess the statistical significance of the observed difference between the AUCs of any two classifiers for a common task estimated from paired data or unpaired balanced data. The software is able to perform a pairwise comparison of many classifiers in a single run, without requiring any expert or advanced knowledge to use it. The software relies on a non-parametric test for the difference of the AUCs that accounts for the correlation of the ROC curves. The results are displayed graphically and can be easily customized by the user. A human-readable report is generated and the complete data resulting from the analysis are also available for download, which can be used for further analysis with other software. The software is released as a web server that can be used in any client platform and also as a standalone application for the Linux operating system. 
CONCLUSION: New software for the statistical comparison of ROC curves is released here as a web server and also as standalone software for the Linux operating system.
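For orientation, the AUC that the software compares can be estimated non-parametrically as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half. The sketch below shows only that estimate; it does not implement the significance test for the difference between two correlated AUCs described above. The function name `auc_mann_whitney` and the example scores are hypothetical.

```python
def auc_mann_whitney(pos_scores, neg_scores):
    """Estimate the area under the ROC curve from raw classifier scores.

    Counts, over all positive/negative pairs, how often the positive case
    scores higher (ties count 0.5), then normalises by the number of pairs.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")

    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores: one positive (0.4) ranks below one negative (0.7).
auc = auc_mann_whitney([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch; a rank-based formulation gives the same value more efficiently on large datasets.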


Subject(s)
Algorithms , Data Interpretation, Statistical , Diagnosis, Computer-Assisted/methods , ROC Curve , Software
13.
Nucleic Acids Res ; 33(Web Server issue): W570-2, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15980538

ABSTRACT

An accurate and robust large-scale melting temperature prediction server for short DNA sequences is presented. The server calculates a consensus melting temperature value using the nearest-neighbor model based on three independent thermodynamic data tables. The consensus method gives an accurate prediction of melting temperature, as was recently demonstrated in a benchmark performed using all available experimental data for DNA sequences within the length range of 16-30 nt. This constitutes the first web server that has been implemented to perform a large-scale calculation of melting temperatures in real time (up to 5000 DNA sequences can be submitted in a single run). The expected accuracy of calculations carried out by this server in the range of 50-600 mM monovalent salt concentration is that 89% of the melting temperature predictions will have an error or deviation of <5 °C from experimental data. The server can be freely accessed at http://dna.bio.puc.cl/tm.html. The standalone executable versions of this software for Linux, Macintosh and Windows platforms are also freely available at the same web site, along with further detailed information supporting this server.
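A minimal sketch of the final step of a nearest-neighbor calculation may help here. It assumes the total enthalpy and entropy of duplex formation have already been obtained by summing the dinucleotide entries of one thermodynamic table, and it applies the usual two-state expression for a non-self-complementary duplex. No table values are reproduced, no salt correction is applied, and averaging the three predictions is an assumed consensus rule that the abstract does not specify. All names and the example numbers are hypothetical.

```python
import math

GAS_CONSTANT = 1.987  # cal/(mol*K)


def two_state_tm_celsius(delta_h_cal, delta_s_cal, strand_conc_molar):
    """Two-state melting temperature for a non-self-complementary duplex.

    delta_h_cal: total enthalpy in cal/mol (negative for duplex formation)
    delta_s_cal: total entropy in cal/(mol*K)
    strand_conc_molar: total strand concentration in mol/L
    """
    tm_kelvin = delta_h_cal / (
        delta_s_cal + GAS_CONSTANT * math.log(strand_conc_molar / 4.0)
    )
    return tm_kelvin - 273.15


def consensus_tm(predictions):
    """Assumed consensus rule: arithmetic mean of the per-table predictions."""
    return sum(predictions) / len(predictions)


# Hypothetical totals for one table, at 1 micromolar total strand concentration.
tm = two_state_tm_celsius(-100000.0, -280.0, 1e-6)
```

In practice the same sequence would be evaluated against each of the three tables and the resulting values passed to `consensus_tm`.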


Subject(s)
Nucleic Acid Hybridization , Sequence Analysis, DNA/methods , Software , Temperature , DNA/chemistry , Internet , Nucleic Acid Denaturation , User-Computer Interface
14.
PLoS One ; 6(7): e22569, 2011.
Article in English | MEDLINE | ID: mdl-21818339

ABSTRACT

Transposable elements comprise a large proportion of animal genomes. Transposons can have detrimental effects on genome stability but also offer positive roles for genome evolution and gene expression regulation. Proper balance of the positive and deleterious effects of transposons is crucial for cell homeostasis and requires a mechanism that tightly regulates their expression. Herein we describe the expression of DNA transposons of the Tc1/mariner superfamily during Xenopus development. Sense and antisense transcripts containing complete Tc1-2_Xt were detected in Xenopus embryos. Both transcripts were found in zygotic stages and were mainly localized in Spemann's organizer and neural tissues. In addition, the Tc1-like elements Eagle, Froggy, Jumpy, Maya, Xeminos and TXr were also expressed in zygotic stages but not in oocytes in X. tropicalis. Interestingly, although Tc1-2_Xt transcripts were not detected in Xenopus laevis embryos, transcripts from two other Tc1-like elements (TXr and TXz) presented a similar temporal and spatial pattern during X. laevis development. Deep sequencing analysis of Xenopus tropicalis gastrulae showed that PIWI-interacting RNAs (piRNAs) are specifically derived from several Tc1-like elements. The localized expression of Tc1-like elements in neural tissues suggests that they could play a role during the development of the Xenopus nervous system.


Subject(s)
DNA Transposable Elements/genetics , Gene Expression Regulation, Developmental , Nervous System/embryology , Nervous System/metabolism , Xenopus/embryology , Xenopus/genetics , Animals , Genome/genetics , RNA, Small Interfering/metabolism , Zygote/metabolism
15.
Science ; 326(5957): 1235-40, 2009 Nov 27.
Article in English | MEDLINE | ID: mdl-19965468

ABSTRACT

The genome of Mycoplasma pneumoniae is among the smallest found in self-replicating organisms. To study the basic principles of bacterial proteome organization, we used tandem affinity purification-mass spectrometry (TAP-MS) in a proteome-wide screen. The analysis revealed 62 homomultimeric and 116 heteromultimeric soluble protein complexes, of which the majority are novel. About a third of the heteromultimeric complexes show higher levels of proteome organization, including assembly into larger, multiprotein complex entities, suggesting sequential steps in biological processes, and extensive sharing of components, implying protein multifunctionality. Incorporation of structural models for 484 proteins, single-particle electron microscopy, and cellular electron tomograms provided supporting structural details for this proteome organization. The data set provides a blueprint of the minimal cellular machinery required for life.


Subject(s)
Bacterial Proteins/analysis , Genome, Bacterial , Multiprotein Complexes/analysis , Mycoplasma pneumoniae/chemistry , Mycoplasma pneumoniae/genetics , Proteome , Bacterial Proteins/isolation & purification , Bacterial Proteins/metabolism , Computational Biology , Mass Spectrometry/methods , Metabolic Networks and Pathways , Microscopy, Electron , Models, Biological , Models, Molecular , Multiprotein Complexes/metabolism , Mycoplasma pneumoniae/metabolism , Mycoplasma pneumoniae/ultrastructure , Pattern Recognition, Automated , Protein Interaction Mapping , Systems Biology