Results 1 - 11 of 11
1.
Mol Biol Evol ; 33(8): 1921-36, 2016 08.
Article in English | MEDLINE | ID: mdl-27189557

ABSTRACT

The gel-forming mucins are large glycosylated proteins that are essential components of the mucus layers covering epithelial cells. Using novel methods of identifying mucins based on profile hidden Markov models, we have found a large number of such proteins in Metazoa, aiding in their classification and allowing evolutionary studies. Most vertebrates have 5-6 gel-forming mucin genes and the genomic arrangement of these genes is well conserved throughout vertebrates. An exception is the frog Xenopus tropicalis with an expanded repertoire of at least 26 mucins of this type. Furthermore, we found that the ovomucin protein, originally identified in chicken, is characteristic of reptiles, birds, and amphibians. Muc6 is absent in teleost fish, but we now show that it is present in animals such as ghost sharks, demonstrating an early origin in vertebrate evolution. Public RNA-Seq data were analyzed with respect to mucins in zebrafish, frog, and chicken, thus allowing comparison with regard to tissue and developmental specificity. Analyses of invertebrate proteins reveal that proteins of the gel-forming mucin type are also widely distributed in this group. Their presence in Cnidaria, Porifera, and in Ctenophora (comb jellies) shows that these proteins were present early in metazoan evolution. Finally, we examined the evolution of the FCGBP protein, abundant in mucus and related to gel-forming mucins in terms of structure and localization. We demonstrate that FCGBP, ubiquitous in vertebrates, has a conserved N-terminal domain. Interestingly, this domain is also present as an N-terminal sequence in a number of bacterial proteins.


Subjects
Cell Adhesion Molecules/genetics , Mucins/genetics , Amino Acid Sequence , Animals , Cell Adhesion Molecules/chemistry , Cell Adhesion Molecules/metabolism , Epithelial Cells/metabolism , Evolution, Molecular , Genome/genetics , Humans , Markov Chains , Mucin-6/chemistry , Mucin-6/genetics , Mucin-6/metabolism , Mucins/chemistry , Mucins/metabolism , Mucus , Ovomucin/chemistry , Ovomucin/genetics , Ovomucin/metabolism , Phylogeny , Sequence Analysis, RNA , Structure-Activity Relationship
2.
BMC Genomics ; 15: 238, 2014 Mar 27.
Article in English | MEDLINE | ID: mdl-24673733

ABSTRACT

BACKGROUND: The Amazonian rainforest is predicted to suffer from ongoing environmental changes. Despite the need to evaluate the impact of such changes on tree genetic diversity, we almost entirely lack genomic resources. RESULTS: In this study, we analysed the transcriptome of four tropical tree species (Carapa guianensis, Eperua falcata, Symphonia globulifera and Virola michelii) with contrasting ecological features, belonging to four widespread botanical families (respectively Meliaceae, Fabaceae, Clusiaceae and Myristicaceae). We sequenced cDNA libraries from three organs (leaves, stems, and roots) using 454 pyrosequencing. We have developed an R and bioperl-based bioinformatic procedure for de novo assembly, gene functional annotation and marker discovery. Mismatch identification takes into account single-base quality values as well as the likelihood of false variants as a function of contig depth and number of sequenced chromosomes. Between 17103 (for Symphonia globulifera) and 23390 (for Eperua falcata) contigs were assembled. Organs varied in the numbers of unigenes they apparently express, with higher numbers in roots. Patterns of gene expression were similar across species, with metabolism of aromatic compounds standing out as an overrepresented gene function. Transcripts corresponding to several gene functions were found to be over- or underrepresented in each organ. We identified between 4434 (for Symphonia globulifera) and 9076 (for Virola surinamensis) well-supported mismatches. The resulting overall mismatch density ranged between 0.89 (S. globulifera) and 1.05 (V. surinamensis) mismatches/100 bp in variation-containing contigs. CONCLUSION: The relative representation of gene functions in the four transcriptomes suggests that secondary metabolism may be particularly important in tropical trees. The differential representation of transcripts among tissues suggests differential gene expression, which opens the way to functional studies in these non-model, ecologically important species. We found substantial amounts of mismatches in the four species. These newly identified putative variants are a first step towards acquiring much needed genomic resources for tropical tree species.
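
The mismatch density quoted in this abstract (mismatches per 100 bp in variation-containing contigs) is a simple ratio. The following is a minimal sketch of that arithmetic only, not the authors' R and bioperl procedure; the function name and the numbers in the usage line are hypothetical and chosen purely for illustration.

```python
def mismatch_density_per_100bp(mismatch_count, contig_lengths):
    """Mismatches per 100 bp over a set of contigs.

    contig_lengths is assumed to hold the lengths (in bp) of the
    variation-containing contigs only, matching the abstract's wording.
    """
    total_bp = sum(contig_lengths)
    if total_bp == 0:
        raise ValueError("no sequence length to normalise by")
    return 100.0 * mismatch_count / total_bp


# Hypothetical figures: 89 mismatches over 10,000 bp of contigs.
density = mismatch_density_per_100bp(89, [4000, 6000])
print(f"{density:.2f} mismatches/100 bp")
```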


Subjects
Genes, Plant , Transcriptome , Trees/genetics , Base Pair Mismatch , Clusiaceae/genetics , Contig Mapping , Fabaceae/genetics , Genetic Variation , High-Throughput Nucleotide Sequencing , Meliaceae/genetics , Myristicaceae/genetics , Polymorphism, Single Nucleotide , Sequence Analysis, DNA
3.
Sci Rep ; 12(1): 20652, 2022 11 30.
Article in English | MEDLINE | ID: mdl-36450890

ABSTRACT

Mucins are large glycoproteins that cover and protect the epithelial surfaces of the body. Mucin domains of gel-forming mucins are rich in proline, threonine, and serine that are heavily glycosylated. These domains show great complexity with tandem repeats, which makes their sequences difficult to study. With the advent of single molecule real-time (SMRT) sequencing technologies, we present the sequence structure of mucin domains via SMRT long reads for the gel-forming mucins MUC2, MUC5AC, MUC5B and MUC6. Our study shows that, across individuals, single nucleotide polymorphisms can be found in the mucin domains of MUC2, MUC5AC, MUC5B and MUC6, while different numbers of tandem repeats can be found in the mucin domains of MUC2 and MUC6. Furthermore, we obtained the sequences of the MUC2, MUC5AC, and MUC5B mucin domains in a Chinese individual with a possible per-nucleotide accuracy of 99.98-99.99%, 99.93-99.99%, and 99.76-99.99%, respectively. We report a new method to obtain the DNA sequence of gel-forming mucin domains. This method will provide new insights into sequencing tandem repeat regions located in coding regions. The sequences obtained through this method provide more information for the study of gel-forming mucin domains.


Subjects
Tandem Repeat Sequences , Technology , Humans , Tandem Repeat Sequences/genetics , Threonine , Asian People , Nucleotides
4.
PLoS One ; 17(10): e0275671, 2022.
Article in English | MEDLINE | ID: mdl-36256656

ABSTRACT

Human tissue surfaces are coated with mucins, a family of macromolecular sugar-laden proteins serving diverse functions from lubrication to the formation of selective biochemical barriers against harmful microorganisms and molecules. Membrane mucins are a distinct group of mucins that are attached to epithelial cell surfaces where they create a dense glycocalyx facing the extracellular environment. All mucin proteins carry long stretches of tandemly repeated sequences that undergo extensive O-linked glycosylation to form linear mucin domains. However, the repetitive nature of mucin domains makes them prone to recombination and renders their genetic sequences particularly difficult to read with standard sequencing technologies. As a result, human mucin genes suffer from significant sequence gaps that have hampered the investigation of gene function in health and disease. Here we leveraged a recent human genome assembly to characterize a previously unmapped MUC3B gene located at the q22 locus on chromosome 7, within a cluster of four structurally related membrane mucin genes that we name the MUC3 cluster. We found that MUC3B shares high sequence identity with the known MUC3A gene and that the two genes are governed by evolutionarily conserved regulatory elements. Furthermore, we show that MUC3A, MUC3B, MUC12, and MUC17 in the human MUC3 cluster are expressed in intestinal epithelial cells (IECs). Our results complete existing genetic gaps in the MUC3 cluster which is a conserved genetic unit in vertebrates. We anticipate our results to be the starting point for the detection of disease-associated polymorphisms in the human MUC3 cluster. Moreover, our study provides the basis for the exploration of intestinal mucin gene function in widely used experimental models such as human intestinal organoids and genetic mouse models.


Subjects
Chromosomes, Human, Pair 7 , Mucins , Animals , Humans , Mice , Amino Acid Sequence , Chromosomes, Human, Pair 7/metabolism , Intestinal Mucosa/metabolism , Mucin-2/genetics , Mucins/metabolism , Multigene Family , Sugars/metabolism
5.
Sci Rep ; 8(1): 8906, 2018 06 11.
Article in English | MEDLINE | ID: mdl-29891987

ABSTRACT

Obtaining chloroplast (cp) genome sequence is necessary for studying physiological roles in plants. However, it is difficult to use traditional sequencing methods to get cp genome sequences because of the complex procedures of preparing templates. With the advent of next-generation sequencing technology, massive genome sequences can be produced. Thus, a good pipeline to assemble next-generation sequence reads with optimized k-mer length is essential to get whole cp genome sequences. Moreover, adjustment of other parameters is also very important, especially for the assembly of the cp genome. In this study, we developed a pipeline to generate the cp genome for Quercus spinosa. When Quercus rubra was used as a reference, we achieved coverage of 97.75% after optimizing k-mer length as well as other parameters. The efficiency of the pipeline makes it a useful method for cp genome construction in plants. It also provides great perspective on the analysis of cp genome characteristics and evolution.
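
To illustrate why k-mer length matters for assembly, the sketch below counts k-mers in a set of reads for several candidate values of k. It is an illustrative toy under stated assumptions, not the pipeline described in this abstract: the function names and the example reads are hypothetical, and a real assembler would also handle reverse complements, sequencing errors, and quality values.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        # Reads shorter than k contribute nothing (empty range).
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_summary(reads, k_values):
    """For each k, report distinct k-mers and how many occur more than once."""
    summary = {}
    for k in k_values:
        counts = count_kmers(reads, k)
        summary[k] = {
            "distinct": len(counts),
            "repeated": sum(1 for c in counts.values() if c > 1),
        }
    return summary


# Hypothetical reads, for illustration only.
reads = ["ACGTACGT", "CGTA"]
print(kmer_summary(reads, [4, 8]))
```

Shorter k-mers tend to recur more often, which helps connect reads but makes repeats ambiguous; longer k-mers are more specific but need deeper coverage. That trade-off is the usual reason k is tuned per data set.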


Subjects
Computational Biology/methods , Genome, Chloroplast , Quercus/genetics , Sequence Analysis, DNA/methods , DNA, Chloroplast/chemistry , DNA, Chloroplast/genetics , High-Throughput Nucleotide Sequencing/methods
6.
Sci Rep ; 8(1): 17503, 2018 11 30.
Article in English | MEDLINE | ID: mdl-30504806

ABSTRACT

The DNA sequences of the two human mucin genes MUC2 and MUC6 have not been completely resolved due to the repetitive nature of their central exons coding for proline-, threonine- and serine-rich sequences. The exact nucleotide sequence of these exons has remained unknown for a long time due to limitations in traditional sequencing techniques. These are still very poorly covered in new whole genome sequencing projects with the corresponding protein sequences partly missing. We used a BAC clone containing both these genes and third generation sequencing technology, SMRT sequencing, to obtain the full-length contiguous MUC2 and MUC6 tandem repeat sequences. The new sequences span the entire repeat regions with good coverage revealing their length, variation in repeat sequences and their internal organization. The sequences obtained were compared with available sequences from whole genome sequencing projects, indicating variation in number of repeats and their internal organization between individuals. The lack of these sequences has limited the association of genetic alterations with disease. The full sequences of these mucins will now allow such studies, which could be of importance for inflammatory bowel diseases for MUC2 and gastric ulcer diseases for MUC6 where deficient mucus protection is assumed to play an important role.


Subjects
Exons , Mucin-2/genetics , Mucin-6/genetics , Repetitive Sequences, Nucleic Acid , Amino Acid Sequence , Chromosomes, Artificial, Bacterial , Inflammatory Bowel Diseases/genetics , Mucin-2/chemistry , Mucin-6/chemistry , Polymorphism, Genetic , Recombination, Genetic , Stomach Ulcer/genetics
7.
BMC Genomics ; 7: 197, 2006 Aug 03.
Article in English | MEDLINE | ID: mdl-16887038

ABSTRACT

BACKGROUND: Mucins are large glycoproteins that cover epithelial surfaces of the body. All mucins contain at least one PTS domain, a region rich in proline, threonine and serine. Mucins are also characterized by von Willebrand D (VWD) domains or SEA domains. We have developed computational methods to identify mucin genes and proteins based on these properties of the proteins. Using such methods we are able to characterize different organisms where genome sequence is available with respect to their mucin repertoire. RESULTS: We have here made a comprehensive analysis of potential mucins encoded by the chicken (Gallus gallus) genome. Three transmembrane mucins (Muc4, Muc13, and Muc16) and four gel-forming mucins (Muc6, Muc2, Muc5ac, and Muc5b) were identified. The gel-forming mucins are encoded within a locus similar to the corresponding human mucins. However, the chicken has an additional gene inserted between Muc2 and Muc5ac that encodes the alpha-subunit of ovomucin, a protein similar to Muc2, but it is lacking a PTS domain. We also show that the beta-subunit of ovomucin is the orthologue of human MUC6. The transmembrane Muc13 gene is in chicken as well as in mammals adjacent to the HEG (heart of glass) gene. HEG has PTS, EGF and transmembrane domains like Muc13, suggesting that these two proteins are evolutionarily related. Unlike previously known mucins, the PTS domain of Muc13 is encoded by multiple exons, where each exon encodes a repeat unit of the PTS domain. CONCLUSION: We report new mucin homologues in chicken and this information will aid in understanding the evolution of mucins in vertebrates. The fact that ovomucin, a protein not found in mammals, was located in the same locus as other gel-forming mucins provides strong support that these proteins are evolutionarily related. Furthermore, a relationship of HEG and the transmembrane Muc13 is suggested on the basis of their biochemical properties and their presence in the same locus. Finally, our finding that the chicken Muc13 is distributed between multiple exons raises the interesting possibility that the length of the PTS domain could be controlled by alternative splicing.


Subjects
Chickens/genetics , Exons , Mucins/genetics , Ovomucin/genetics , Protein Structure, Tertiary/genetics , Amino Acid Sequence , Animals , Computational Biology/methods , Evolution, Molecular , Forecasting , Genome , Humans , Membrane Glycoproteins/genetics , Molecular Sequence Data , Mucin-5AC , Mucin-2 , Mucin-5B , Mucin-6 , Multigene Family , Phylogeny , Sequence Homology, Amino Acid , Tandem Repeat Sequences , Vertebrates/genetics , Zebrafish/genetics , Zebrafish Proteins/genetics
8.
Front Plant Sci ; 6: 882, 2015.
Article in English | MEDLINE | ID: mdl-26557129

ABSTRACT

The KNOX (KNOTTED1-like homeobox) transcription factors play a pivotal role in leaf and meristem development. The majority of these proteins are characterized by the KNOX1, KNOX2, ELK, and homeobox domains whereas the proteins of the KNATM family contain only the KNOX domains. We carried out an extensive inventory of these proteins and here report on a total of 394 KNOX proteins from 48 species. The land plant proteins fall into two classes (I and II) as previously shown where the class I family seems to be most closely related to the green algae homologs. The KNATM proteins are restricted to Eudicots and some species have multiple paralogs of this protein. Certain plants are characterized by a significant increase in the number of KNOX paralogs; one example is Glycine max. Through the analysis of public gene expression data we show that the class II proteins of this plant have a relatively broad expression specificity as compared to class I proteins, consistent with previous studies of other plants. In G. max, class I proteins are mainly distributed in axis tissues and KNATM paralogs are overall poorly expressed; highest expression is in the early plumular axis. Overall, analysis of gene expression in G. max demonstrates clearly that the expansion in gene number is associated with functional diversification.

9.
PLoS One ; 9(9): e108719, 2014.
Article in English | MEDLINE | ID: mdl-25269070

ABSTRACT

Predicting protein domains is essential for understanding a protein's function at the molecular level. However, up till now, there has been no direct and straightforward method for predicting protein domains in species without a reference genome sequence. In this study, we developed a functionality with a set of programs that can predict protein domains directly from genomic sequence data without a reference genome. Using whole genome sequence data, the programming functionality mainly comprised DNA assembly in combination with next-generation sequencing (NGS) assembly methods and traditional methods, peptide prediction and protein domain prediction. The proposed new functionality avoids problems associated with de novo assembly due to micro reads and small single repeats. Furthermore, we applied our functionality for the prediction of leucine rich repeat (LRR) domains in four species of Ficus with no reference genome, based on NGS genomic data. We found that the LRRNT_2 and LRR_8 domains are related to plant transpiration efficiency, as indicated by the stomata index, in the four species of Ficus. The programming functionality established in this study provides new insights for protein domain prediction, which is particularly timely in the current age of NGS data expansion.


Subjects
Ficus/genetics , Gene Expression Regulation, Plant , Genome, Plant , Leucine/metabolism , Plant Proteins/genetics , Repetitive Sequences, Amino Acid , Amino Acid Sequence , Ficus/chemistry , High-Throughput Nucleotide Sequencing , Leucine/chemistry , Molecular Sequence Annotation , Molecular Sequence Data , Plant Proteins/metabolism , Plant Stomata/physiology , Plant Transpiration/physiology , Protein Structure, Tertiary
10.
Proc Natl Acad Sci U S A ; 104(41): 16209-14, 2007 Oct 09.
Article in English | MEDLINE | ID: mdl-17911254

ABSTRACT

Mucins are proteins that cover and protect epithelial cells and are characterized by domains rich in proline, threonine, and serine that are heavily glycosylated (PTS or mucin domains). Because of their sequence polymorphism, these domains cannot be used for evolutionary analysis. Instead, we have made use of the von Willebrand D (VWD) and SEA domains, typical for mucins. A number of animal genomes were examined for these domains to identify mucin homologues, and domains of the resulting proteins were used in phylogenetic studies. The frog Xenopus tropicalis stands out because the number of gel-forming mucins has markedly increased to at least 25 as compared with 5 for higher animals. Furthermore, the frog Muc2 homologues contain unique PTS domains where cysteines are abundant. This animal also has a unique family of secreted mucin-like proteins with alternating PTS and SEA domains, a type of protein also identified in the fishes. The evolution of the Muc4 mucin seems to have occurred by recruitment of a PTS domain to AMOP, NIDO, and VWD domains from a sushi domain-containing family of proteins present in lower animals, and Xenopus is the most deeply branching animal where a protein similar to the mammalian Muc4 was identified. All transmembrane mucins seem to have appeared in the vertebrate lineage, and the MUC1 mucin is restricted to mammals. In contrast, proteins with properties of the gel-forming mucins were identified also in the starlet sea anemone Nematostella vectensis, demonstrating an early origin of this group of mucins.


Subjects
Evolution, Molecular , Mucins/genetics , Animals , Chordata, Nonvertebrate/genetics , Gels , Humans , Membrane Glycoproteins/chemistry , Membrane Glycoproteins/genetics , Mucins/chemistry , Phylogeny , Protein Structure, Tertiary , Strongylocentrotus purpuratus/genetics , Xenopus/genetics , Xenopus Proteins/chemistry , Xenopus Proteins/genetics , von Willebrand Factor/chemistry , von Willebrand Factor/genetics
11.
Glycobiology ; 14(6): 521-7, 2004 Jun.
Article in English | MEDLINE | ID: mdl-15044386

ABSTRACT

Mucins are large glycoproteins characterized by mucin domains that show little sequence conservation and are rich in the amino acids Ser, Thr, and Pro. To effectively predict mucins from genomic and protein sequences obtained from genome projects, we developed a strategy based on the amino acid compositional bias characteristic of the mucin domains. This strategy is combined with an analysis of other features commonly found in mucins. Our method has now been used to predict mucins in the puffer fish Fugu rubripes that were previously not identified or annotated. At least three gel-forming mucins were found with the same general domain structure as the human MUC2 mucin. In addition one transmembrane mucin was identified with SEA and EGF domains as found in the mammalian transmembrane mucins. These results suggest that the number of gel-forming mucins has been conserved during evolution of the vertebrates, whereas the family of transmembrane mucins has been markedly expanded in the higher vertebrates.
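
The compositional-bias idea in this abstract can be sketched as a sliding-window scan for regions rich in Pro, Thr, and Ser. This is only an illustration of the general approach, not the authors' published method: the function names, the window size, and the threshold are assumptions chosen for the example, and the abstract notes that the real strategy is combined with an analysis of other mucin features.

```python
def pts_fraction(window):
    """Fraction of residues in the window that are Pro (P), Thr (T), or Ser (S)."""
    return sum(residue in "PTS" for residue in window) / len(window)


def find_pts_rich_regions(sequence, window=100, threshold=0.4):
    """Return merged half-open (start, end) intervals of PTS-rich windows.

    window and threshold are illustrative defaults, not published values.
    """
    sequence = sequence.upper()
    regions = []
    for start in range(len(sequence) - window + 1):
        if pts_fraction(sequence[start:start + window]) >= threshold:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous hit: extend it.
                regions[-1] = (regions[-1][0], max(regions[-1][1], end))
            else:
                regions.append((start, end))
    return regions


# Hypothetical toy protein: a PTS repeat flanked by alanine stretches.
toy = "A" * 50 + "PTS" * 20 + "A" * 50
print(find_pts_rich_regions(toy, window=30, threshold=1.0))
```

Merging overlapping windows keeps one interval per candidate domain, which is easier to inspect alongside other features such as VWD or SEA domains.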


Subjects
Biopolymers/chemistry , Computational Biology , Membrane Proteins/chemistry , Mucins/chemistry , Animals , Takifugu