ABSTRACT
Balanced chromosomal rearrangements (BCR) are associated with abnormal phenotypes in approximately 6% of balanced translocations and 9.4% of balanced inversions. Abnormal phenotypes can be caused by disruption of genes at the breakpoints, by deletions, or by position effects. Conventional cytogenetic techniques have limited resolution and do not allow a thorough genetic investigation. Molecular techniques applied to BCR carriers can contribute to the characterization of this type of chromosomal rearrangement and to the phenotype-genotype correlation. Fifteen of 35 individuals with abnormal phenotypes and BCR were selected for further investigation by molecular techniques. The chromosomal rearrangements comprised 11 reciprocal translocations, 3 inversions, and 1 balanced insertion. Array genomic hybridization (AGH) was performed, and genomic imbalances were detected in 20% of the cases: 1 at a rearrangement breakpoint and 2 at further breakpoints in other chromosomes. The alterations were confirmed by FISH and associated with the phenotype of the carriers. In the analyzed cases showing no genomic imbalances by AGH, next-generation sequencing (NGS) was performed on whole-genome libraries prepared following the Illumina TruSeq DNA PCR-Free protocol (Illumina®) and sequenced on an Illumina HiSeq 2000 as 150-bp paired-end reads. The NGS results suggested breakpoints in 7 cases that were similar or close to those estimated by karyotyping. The genes overlapping 6 breakpoint regions were analyzed. Follow-up of BCR carriers would improve knowledge of these chromosomal rearrangements and their consequences.
ABSTRACT
MOTIVATION: Current computational approaches to function prediction are mostly based on protein sequence classification and on the transfer of annotation from known proteins to their closest homologous sequences, relying on the orthology concept of function conservation. This approach has a major weakness: annotation reliability depends on global sequence similarity to known proteins, and it performs poorly for enzyme superfamilies whose members catalyze different reactions. Structural biology offers a different strategy to overcome the annotation problem by adding information about protein 3D structures. This information can be used to identify amino acids located in active sites, focusing on the detection of functionally polymorphic residues in an enzyme superfamily. Structural genomics programs are providing more and more novel protein structures at a high-throughput rate. However, there is still a huge gap between the number of sequences and the number of available structures. Computational methods such as homology modeling provide reliable approaches to bridge this gap and could be a new, precise tool to annotate protein functions. RESULTS: Here, we present the Active Sites Modeling and Clustering (ASMC) method, a novel unsupervised method to classify sequences using structural information on protein pockets. ASMC combines homology modeling of family members, structural alignment of the modeled active sites, and a subsequent hierarchical conceptual classification. Comparison of the profiles obtained from the computed clusters allows the identification of residues correlated with subfamily function divergence, called specificity determining positions. The ASMC method was validated on a benchmark of 42 Pfam families for which previously resolved holo-structures were available. ASMC was also applied to several families containing known protein structures and comprehensive functional annotations. We discuss how ASMC improves the annotation and understanding of protein family functions by giving specific illustrative examples on nucleotidyl cyclases, protein kinases and serine proteases. AVAILABILITY: http://www.genoscope.fr/ASMC/.
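The profile comparison described in the abstract can be illustrated with a minimal sketch. This is not the ASMC implementation: it assumes the active-site residues are already aligned and already grouped into clusters, and it uses a simple rule (each cluster conserved, clusters disagreeing on the dominant residue) in place of the hierarchical conceptual classification. The function names, the `min_conservation` threshold and the toy sequences are all hypothetical.

```python
from collections import Counter


def column_profile(seqs, pos):
    """Return residue frequencies at one aligned position of a cluster."""
    counts = Counter(s[pos] for s in seqs)
    total = sum(counts.values())
    return {res: n / total for res, n in counts.items()}


def specificity_positions(clusters, min_conservation=0.8):
    """List positions conserved inside every cluster but differing between clusters.

    `clusters` maps a cluster name to equal-length strings of aligned
    active-site residues. The conservation threshold is an assumption.
    """
    length = len(next(iter(clusters.values()))[0])
    hits = []
    for pos in range(length):
        dominant = []
        for seqs in clusters.values():
            profile = column_profile(seqs, pos)
            res, freq = max(profile.items(), key=lambda kv: kv[1])
            if freq < min_conservation:
                break  # position is too variable inside this cluster
            dominant.append(res)
        else:
            # Every cluster is conserved; keep the position if they disagree.
            if len(set(dominant)) > 1:
                hits.append(pos)
    return hits


# Toy data (hypothetical): two clusters of five aligned pocket residues.
clusters = {
    "cluster_1": ["GDSAK", "GDSAR", "GDSAK"],
    "cluster_2": ["GESTK", "GESTK", "GESTR"],
}
print(specificity_positions(clusters))  # [1, 3]
```

In this toy case positions 1 and 3 are reported because each cluster is fully conserved there while the dominant residues differ (D versus E, A versus T); position 4 is skipped because the first cluster falls below the conservation threshold.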
Subject(s)
Proteins/classification , Sequence Analysis, Protein/methods , Catalytic Domain , Cluster Analysis , Computational Biology/methods , Enzymes/classification , Models, Biological , Molecular Sequence Annotation , Phosphorus-Oxygen Lyases/chemistry , Protein Kinases/chemistry , Proteins/chemistry , Proteins/metabolism , Sequence Alignment , Serine Proteases/chemistry
ABSTRACT
Single nucleotide polymorphism (SNP) markers have been shown to be useful in genetic investigations of medically important parasites and their hosts. In this paper, we describe the prediction and validation of SNPs in ESTs of Schistosoma mansoni. We used 107,417 public sequences of S. mansoni and identified 15,614 high-quality candidate SNPs in 12,184 contigs. Predicted SNPs were observed in well-characterized antigens and vaccine candidates, such as those coding for myosin, Sm14, Sm23, cathepsin B and triosephosphate isomerase (TPI). Additionally, SNPs were experimentally validated for cathepsin B. A comparative model of S. mansoni cathepsin B was built to predict the possible consequences of the amino acid substitutions for the protein structure. An analysis of the substitutions indicated that the affected amino acids were mostly located on the surface of the molecule, and we found no evidence for a significant conformational change of the enzyme. However, at least one of the substitutions could result in a structural modification of an epitope.
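The abstract does not state the criteria used to call a candidate SNP, so the following is only a generic, redundancy-based sketch of the idea: within a contig of aligned EST reads, a column is flagged when a second base is supported by enough reads. The function name, the `min_minor_reads` threshold and the toy reads are hypothetical, and base quality values are ignored.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_minor_reads=2):
    """Return (column, major_base, minor_base) for columns with a supported second allele.

    `aligned_reads` are equal-length strings from one contig alignment;
    gaps ('-') and ambiguous bases are ignored. The read-support
    threshold is an assumption, not the published criterion.
    """
    snps = []
    for col in range(len(aligned_reads[0])):
        bases = [r[col] for r in aligned_reads if r[col] in "ACGT"]
        top_two = Counter(bases).most_common(2)
        if len(top_two) == 2 and top_two[1][1] >= min_minor_reads:
            snps.append((col, top_two[0][0], top_two[1][0]))
    return snps


# Toy contig (hypothetical): six aligned reads.
reads = [
    "ACGTAC",
    "ACGTAC",
    "ACATAC",
    "ACATAC",
    "ACGT-C",
    "ACGTAT",
]
print(candidate_snps(reads))  # [(2, 'G', 'A')]
```

Column 2 is reported because the minor base A appears in two reads, whereas the single T in the last column is treated as a likely sequencing error under this threshold.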
Subject(s)
Schistosoma mansoni/genetics , Animals , Antigens, Helminth/genetics , Cathepsin B/chemistry , Cathepsin B/genetics , Fatty Acid Transport Proteins/genetics , Genes, Helminth/genetics , Helminth Proteins/genetics , Models, Molecular , Myosins/genetics , Polymorphism, Single Nucleotide , Triose-Phosphate Isomerase/genetics
ABSTRACT
The Tropical Biominer Project is a recent initiative of the Federal University of Minas Gerais (UFMG) and the Oswaldo Cruz Foundation, with the participation of the Biominas Foundation (Belo Horizonte, Minas Gerais, Brazil) and the start-up Homologix. The main objective of the project is to build a new resource for chemogenomics research on chemical compounds, with a strong emphasis on natural molecules. The adopted technologies include information retrieval from structured, semi-structured, and non-structured documents (the last two from the web) and data mining tools to gather information from different sources. The database supports the development of applications that search for new potential treatments for parasitic infections using virtual screening tools. We present here the midpoint of the project: the conception and implementation of the Tropical Biominer Database. This is a federated database designed to store data from different resources. A web crawler connected to the database gathers information from distinct, patented web sites and stores it after automatic classification using data mining tools. Finally, we demonstrate the value of the approach by formulating new hypotheses on specific targets of a natural compound, violacein, using inferences from a virtual screening procedure.