ABSTRACT
A 5',7-methylguanosine cap is a quintessential feature of RNA polymerase II-transcribed RNAs, and a textbook aspect of co-transcriptional RNA processing. The cap is bound by the cap-binding complex (CBC), canonically consisting of nuclear cap-binding proteins 1 and 2 (NCBP1/2). Interest in the CBC has recently renewed due to its participation in RNA-fate decisions via interactions with RNA productive factors as well as with adapters of the degradative RNA exosome. A novel cap-binding protein, NCBP3, was recently proposed to form an alternative CBC together with NCBP1, and to interact with the canonical CBC along with the protein SRRT. The theme of post-transcriptional RNA fate, and how it relates to co-transcriptional ribonucleoprotein assembly, is abundant with complicated, ambiguous, and likely incomplete models. In an effort to clarify the compositions of NCBP1-, 2- and 3-related macromolecular assemblies, we have applied an affinity capture-based interactome screen where the experimental design and data processing have been modified to quantitatively identify interactome differences between targets under a range of experimental conditions. This study generated a comprehensive view of NCBP-protein interactions in the ribonucleoprotein context and demonstrates the potential of our approach to benefit the interpretation of complex biological pathways.
Subjects
Nuclear Cap-Binding Protein Complex/genetics; Nuclear Proteins/genetics; Proteome/genetics; RNA Cap-Binding Proteins/genetics; Cytoplasm/immunology; Exosome Multienzyme Ribonuclease Complex/genetics; Humans; Proteomics/methods; RNA Caps/genetics; RNA Polymerase II/genetics
ABSTRACT
BACKGROUND: Long interspersed element-1 (LINE-1, L1) is the major driver of mobile DNA activity in modern humans. When expressed, LINE-1 loci produce bicistronic transcripts encoding two proteins essential for retrotransposition, ORF1p and ORF2p. Many types of human cancers are characterized by L1 promoter hypomethylation, L1 transcription, L1 ORF1p protein expression, and somatic L1 retrotransposition. ORF2p encodes the endonuclease and reverse transcriptase activities required for L1 retrotransposition. Its expression is poorly characterized in human tissues and cell lines. RESULTS: We report mass spectrometry-based tumor proteome profiling studies wherein ORF2p eludes detection. To test whether ORF2p could be detected with specific reagents, we developed and validated five rabbit monoclonal antibodies with immunoreactivity for specific epitopes on the protein. These reagents readily detect ectopic ORF2p expressed from bicistronic L1 constructs. However, endogenous ORF2p is not detected in human tumor samples or cell lines by western blot, immunoprecipitation, or immunohistochemistry despite high levels of ORF1p expression. Moreover, we report endogenous ORF1p-associated interactomes, affinity isolated from colorectal cancers, wherein we similarly fail to detect ORF2p. These samples include primary tumors harboring hundreds of somatically acquired L1 insertions. The new data are available via ProteomeXchange with identifier PXD013743. CONCLUSIONS: Although somatic retrotransposition provides unequivocal genetic evidence for the expression of ORF2p in human cancers, we are unable to directly measure its presence using several standard methods. Experimental systems have previously indicated an unequal stoichiometry between ORF1p and ORF2p, but in vivo, the expression of these two proteins may be more strikingly uncoupled. These findings are consistent with observations that ORF2p is not tolerable for cell growth.
ABSTRACT
Long Interspersed Nuclear Element-1 (LINE-1, L1) constitutes a family of autonomous, self-replicating genetic elements known as retrotransposons. Although most are inactive, copious L1 sequences populate the human genome. L1s proliferate in a 'copy-and-paste' fashion through an RNA intermediate; a full-length L1 transcript is ~6,000 nucleotides long and functions as a bicistronic mRNA that encodes and assembles in cis with two main polypeptides, ORF1p and ORF2p, forming a ribonucleoprotein (RNP); L1 RNPs also interact with a wide range of host factors in positive and negative regulatory capacities. The following protocol describes an approach to affinity enrich ectopically expressed L1 RNPs and, using RNases, release the fraction of protein that depends upon the presence of intact RNA for retention in the immobilized macromolecules.
ABSTRACT
BACKGROUND: Metagenomic surveys of human microbiota are becoming increasingly widespread in academic research as well as in the food and pharmaceutical industries and in clinical contexts. Intuitive tools for investigating experimental data are of high interest to researchers. RESULTS: Knomics-Biota is a web-based resource for exploratory analysis of human gut metagenomes. Users can generate and share analytical reports corresponding to common experimental schemes (such as a case-control study or a paired comparison). Interactive visualizations and statistical analysis are provided in association with external factors and in the context of thousands of publicly available datasets arranged into thematic collections. The web service is available at https://biota.knomics.ru. CONCLUSIONS: The Knomics-Biota web service is a comprehensive tool for interactive metagenomic data analysis.
ABSTRACT
GWAS have identified >200 risk loci for Inflammatory Bowel Disease (IBD). The majority of disease associations are known to be driven by regulatory variants. To identify the putative causative genes that are perturbed by these variants, we generate a large transcriptome data set (nine disease-relevant cell types) and identify 23,650 cis-eQTL. We show that these are determined by ~9720 regulatory modules, of which ~3000 operate in multiple tissues and ~970 on multiple genes. We identify regulatory modules that drive the disease association for 63 of the 200 risk loci, and show that these are enriched in multigenic modules. Based on these analyses, we resequence 45 of the corresponding 100 candidate genes in 6600 Crohn disease (CD) cases and 5500 controls, and show with burden tests that they include likely causative genes. Our analyses indicate that ≥10-fold larger sample sizes will be required to demonstrate the causality of individual genes using this approach.
Subjects
Inflammatory Bowel Diseases/genetics; Multifactorial Inheritance; Adult; Aged; Aged, 80 and over; Cohort Studies; Crohn Disease/genetics; Female; Gene Expression Profiling; Genetic Association Studies; Genetic Predisposition to Disease; Genotype; Humans; Male; Middle Aged; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Sequence Analysis, DNA
ABSTRACT
Personalized nutrition is of increasing interest to individuals actively monitoring their health. The relations between the duration of diet intervention and the effects on gut microbiota have yet to be elucidated. Here we examined the associations of short-term dietary changes, long-term dietary habits and lifestyle with gut microbiota. Stool samples from 248 citizen-science volunteers were collected before and after a self-reported 2-week personalized diet intervention, then analyzed using 16S rRNA sequencing. Considerable correlations between long-term dietary habits and gut community structure were detected. A higher intake of vegetables and fruits was associated with increased levels of butyrate-producing Clostridiales and higher community richness. A paired comparison of the metagenomes before and after the 2-week intervention showed that even a brief, uncontrolled intervention produced profound changes in community structure: decreased levels of the Bacteroidaceae, Porphyromonadaceae and Rikenellaceae families and decreased alpha-diversity, coupled with an increase in Methanobrevibacter, Bifidobacterium, Clostridium and butyrate-producing Lachnospiraceae, as well as in the prevalence of a permatype (a bootstrapping-based variation of enterotype) associated with a higher diversity of diet. The response of microbiota to the intervention was dependent on the initial microbiota state. These findings pave the way for the development of an individualized diet.
Subjects
Diet; Gastrointestinal Microbiome; Bacteroidetes/genetics; Bacteroidetes/isolation & purification; Bifidobacterium/genetics; Bifidobacterium/isolation & purification; Clostridium/genetics; Clostridium/isolation & purification; Cluster Analysis; Feces/chemistry; Feces/microbiology; Humans; Metagenome; Methanobrevibacter/genetics; Methanobrevibacter/isolation & purification; RNA, Ribosomal, 16S/genetics; Sample Size; Sequence Analysis, DNA
ABSTRACT
Long Interspersed Nuclear Element-1 (LINE-1, L1) is a mobile genetic element active in human genomes. L1-encoded ORF1 and ORF2 proteins bind L1 RNAs, forming ribonucleoproteins (RNPs). These RNPs interact with diverse host proteins, some repressive and others required for the L1 lifecycle. Using differential affinity purifications, quantitative mass spectrometry, and next generation RNA sequencing, we have characterized the proteins and nucleic acids associated with distinctive, enzymatically active L1 macromolecular complexes. Among them, we describe a cytoplasmic intermediate that we hypothesize to be the canonical ORF1p/ORF2p/L1-RNA-containing RNP, and we describe a nuclear population containing ORF2p, but lacking ORF1p, which likely contains host factors participating in target-primed reverse transcription.
Subjects
Endonucleases/analysis; Long Interspersed Nucleotide Elements; Macromolecular Substances/chemistry; RNA-Directed DNA Polymerase/analysis; RNA/analysis; Ribonucleoproteins/analysis; Chromatography, Affinity; HeLa Cells; Humans; Mass Spectrometry
ABSTRACT
Melioribacter roseus, a representative of the recently proposed phylum Ignavibacteriae, is a metabolically versatile thermophilic bacterium that inhabits the subsurface biosphere of the West-Siberian megabasin and is capable of growing on various substrates and electron acceptors. Genomic analysis followed by inhibitor studies and membrane potential measurements of aerobically grown M. roseus cells revealed the activity of an aerobic respiratory electron transfer chain composed of respiratory complexes I and IV and an alternative complex III. Phylogeny reconstruction revealed that the oxygen reductases belong to the atypical cc(o/b)o3-type and the canonical cbb3-type cytochrome oxidases. In addition, two molybdoenzymes of M. roseus were affiliated with either the Ttr or the Psr/Phs clade, but not with the typical respiratory arsenate reductases of the Arr clade. Expression profiling at both the transcript and protein levels allowed us to assign the role of the terminal respiratory oxidase under atmospheric oxygen concentration to the cc(o/b)o3 cytochrome oxidase, previously proposed to serve for oxygen detoxification only. Transcriptomic analysis revealed the involvement of both molybdoenzymes of M. roseus in As(V) respiration, yet differences in the genomic context of their gene clusters suggest distinct roles in arsenate metabolism, with the 'Psr/Phs'-type molybdoenzyme being the most probable candidate respiratory arsenate reductase. Based on the multi-omics data, pathways for aerobic and arsenate respiration were proposed. Our results start to bridge the rapidly widening gap between homology-based predictions and experimentally verified metabolic processes, which is especially important for understudied microorganisms of novel lineages from deep subsurface environments of Eurasia, which have remained separated from the rest of the biosphere for several geological periods.
ABSTRACT
Numerous studies are devoted to the intestinal microbiota and the intercellular communication that maintains homeostasis. In this regard, vesicles secreted by bacteria are one of the most popular research topics. For example, the outer membrane vesicles (OMVs) of Bacteroides fragilis play an important nutritional role with respect to other microorganisms and promote anti-inflammatory effects on immune cells. However, toxigenic B. fragilis (ETBF) contributes to bowel disease, even causing colon cancer. If nontoxigenic B. fragilis (NTBF) vesicles exert a beneficial effect on the intestine, it is plausible that ETBF vesicles serve to realize the strain's pathogenic potential. To test this possibility, we performed comparative proteomic HPLC-MS/MS analysis of vesicles isolated from ETBF and NTBF. Furthermore, we performed, for the first time, comparative HPLC-MS/MS and GC-MS metabolomic analysis of the vesicles isolated from both strains, with subsequent reconstruction of the vesicle metabolic pathways. We used fluxomic experiments to validate the reconstructed biochemical reaction activities and observed considerable differences in the vesicle proteome and metabolome profiles. In contrast to NTBF OMVs, the metabolic activity of ETBF OMVs makes them resemble microreactors that are likely used for long-term persistence and for realizing pathogenic potential in the host.
Subjects
Bacteroides fragilis/cytology; Metabolomics/methods; Secretory Vesicles/metabolism; Bacteroides fragilis/pathogenicity; Chromatography, High Pressure Liquid; Metabolic Networks and Pathways; Tandem Mass Spectrometry
ABSTRACT
Bacteria of the class Mollicutes (mycoplasmas) feature significant genome reduction, which makes them good model organisms for systems biology studies. Previously we demonstrated that the drastic transcriptional response of mycoplasmas to stress results in a very limited response at the protein level. In this study we used a heat stress model of M. gallisepticum and ribosome profiling to elucidate the process of genetic information transfer under stress. We found that under heat stress ribosomes demonstrate selectivity in mRNA binding. We found that the heat stress response may be divided into two groups on the basis of absolute transcript abundance and fold-change in the translatome. One represents a noise-like response and the other is likely an adaptive one. The latter includes the ClpB chaperone, the cell division cluster, homologs of immunoblocking proteins and short ORFs of unknown function. We found that the previously identified read-through of terminators contributes to the upregulation of transcripts in the translatome as well. In addition, we found that ribosomes of M. gallisepticum undergo reorganization under heat stress. The most notable event is a decrease in the amount of associated HU protein. In conclusion, only changes in a few adaptive transcripts significantly impact the translatome, while widespread noise-like transcription plays an insignificant role in translation during stress.
Subjects
Adaptation, Physiological/genetics; Heat-Shock Response/genetics; Mycoplasma gallisepticum/genetics; Ribosomes/genetics; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Base Sequence; Chromatography, High Pressure Liquid; Gene Expression Profiling/methods; Gene Expression Regulation, Bacterial; Hot Temperature; Mycoplasma gallisepticum/metabolism; Protein Biosynthesis/genetics; RNA, Messenger/genetics; RNA, Messenger/metabolism; Ribosomes/metabolism; Stress, Physiological; Tandem Mass Spectrometry
ABSTRACT
BACKGROUND: Proteomics of bacterial pathogens is a developing field exploring microbial physiology, gene expression and the complex interactions between bacteria and their hosts. One complication of the proteomic approach is the micro- and macro-heterogeneity of bacterial species, which makes it impossible to build a comprehensive database of bacterial genomes for identification, while most existing algorithms rely largely on genomic data. RESULTS: Here we present a large-scale study identifying single amino acid polymorphisms (SAPs) between bacterial strains. An ad hoc method was developed based on MS/MS spectra comparison without the support of a genomic database. Whole-genome sequencing was used to validate the accuracy of polymorphism detection. Several approaches previously presented to the proteomics community as useful for polymorphism detection were tested on isolates of Helicobacter pylori, Neisseria gonorrhoeae and Escherichia coli. CONCLUSION: The developed method is a promising approach in bacterial proteomics that makes it possible to identify hundreds of peptides with novel SAPs from a single proteome.
Subjects
Algorithms; Amino Acids/metabolism; Bacteria/metabolism; Bacterial Proteins/metabolism; Proteome/analysis; Proteomics/methods; Amino Acid Substitution; Amino Acids/chemistry; Amino Acids/genetics; Databases, Protein; Genome, Bacterial; Genomics/methods; Mutation/genetics; Peptide Fragments/analysis; Tandem Mass Spectrometry/methods
ABSTRACT
Beijing B0/W148, a "successful" clone of Mycobacterium tuberculosis, is widespread in the Russian Federation and some countries of the former Soviet Union. Here, we used label-free gel-LC-MS/MS shotgun proteomics to discover features of Beijing B0/W148 strains that could explain their success. Qualitative and quantitative proteome analyses of Beijing B0/W148 strains allowed us to identify 1,868 proteins, including 266 that were differentially abundant compared with the control strain H37Rv. To predict the biological effects of the observed differences in protein abundances, we performed Gene Ontology analysis together with analysis of protein-DNA interactions using a gene regulatory network. Our results demonstrate that Beijing B0/W148 strains have increased levels of enzymes responsible for long-chain fatty acid biosynthesis, along with a coincident decrease in the abundance of proteins responsible for their degradation. Together with high levels of HsaA (Rv3570c) protein, involved in steroid degradation, these findings provide a possible explanation for the increased transmissibility of Beijing B0/W148 strains and their survival in host macrophages. Among other findings, we confirmed a very low level of the SseA (Rv3283) protein in Beijing B0/W148, characteristic of all "modern" Beijing strains, which could lead to increased DNA oxidative damage and accumulation of mutations, and potentially facilitate the development of drug resistance.
Subjects
Bacterial Proteins/analysis; Mycobacterium tuberculosis/chemistry; Proteome/analysis; Chromatography, Liquid; Gene Ontology; Gene Regulatory Networks; Tandem Mass Spectrometry; Virulence Factors/analysis
ABSTRACT
Fragilysin (BFT) is a protein secreted by enterotoxigenic Bacteroides fragilis strains. BFT contains a zinc-binding motif found in the metzincin family of metalloproteinases. In this study, we generated the three known recombinant isoforms of BFT using Escherichia coli, tested their activity and examined whether E-cadherin is a substrate for BFTs. BFT treatment of HT-29 cells induced cleavage of endogenous E-cadherin, and this BFT activity required the native structure of the zinc-binding motif. At the same time, recombinant BFTs did not cleave recombinant E-cadherin or E-cadherin in isolated cell fractions. This indicates that E-cadherin may not be a direct substrate for BFT. We also detected and identified proteins released into the culture medium after treatment of HT-29 cells with BFT. The role of these proteins in pathogenesis and in the cell response to BFT remains to be determined.
Subjects
Cadherins/metabolism; Metalloendopeptidases/metabolism; Protein Isoforms/metabolism; Bacteroides fragilis/enzymology; Bacteroides fragilis/genetics; Cell Line; Epithelial Cells/drug effects; Escherichia coli/genetics; Escherichia coli/metabolism; Humans; Metalloendopeptidases/genetics; Protein Isoforms/genetics; Proteolysis; Recombinant Proteins/genetics; Recombinant Proteins/metabolism
ABSTRACT
BACKGROUND: Protein degradation is a basic cell process that operates in general protein turnover or to produce bioactive peptides. However, very little is known about the qualitative and quantitative composition of a plant cell peptidome, the actual result of this degradation. In this study we comprehensively analyzed a plant cell peptidome and systematically analyzed the peptide generation process. RESULTS: We thoroughly analyzed native peptide pools of Physcomitrella patens moss in two developmental stages as well as in protoplasts. Peptidomic analysis was supplemented by transcriptional profiling and quantitative analysis of precursor proteins. In total, over 20,000 unique endogenous peptides, ranging in size from 5 to 78 amino acid residues, were identified. We showed that in both the protonema and protoplast states, plastid proteins served as the main source of peptides and that their major fraction formed outside of chloroplasts. However, in general, the composition of peptide pools was very different between these cell types. In gametophores, stress-related proteins, e.g., late embryogenesis abundant proteins, were among the most productive precursors. The Driselase-mediated protonema conversion to protoplasts led to a peptide generation "burst", with a several-fold increase in the number of components in the latter. Degradation of plastid proteins in protoplasts was accompanied by suppression of photosynthetic activity. CONCLUSION: We suggest that peptide pools in plant cells are not merely a product of waste protein degradation, but may serve as important functional components for plant metabolism. We assume that the peptide "burst" is a form of biotic stress response that might produce peptides with antimicrobial activity from originally functional proteins. Potential functions of peptides in different developmental stages are discussed.
Subjects
Bryopsida/cytology; Bryopsida/metabolism; Germ Cells, Plant/cytology; Germ Cells, Plant/metabolism; Peptides/metabolism; Plant Cells/metabolism; Protoplasts/cytology; Bryopsida/genetics; Gene Expression Profiling; Gene Expression Regulation, Plant; Photosynthesis; Plant Proteins/metabolism; Proteome/metabolism; Protoplasts/metabolism; Sequence Alignment
ABSTRACT
The avian bacterial pathogen Mycoplasma gallisepticum is a good model for systems studies due to its small genome and the simplicity of its regulatory pathways. In this study, we used RNA-Seq and MS-based proteomics to accurately map coding sequences, transcription start sites (TSSs) and transcript 3'-ends (T3Es). We used the data obtained to investigate the roles of TSSs and T3Es in stress-induced transcriptional responses. We identified 1061 TSSs at a false discovery rate of 10% and showed that almost all transcription in M. gallisepticum is initiated from classic TATAAT promoters surrounded by A/T-rich sequences. Our analysis revealed pronounced complexity in operon structure: on average, each coding operon has one internal TSS and T3Es in addition to the primary ones. Our transcriptomic approach, based on the intervals between the two nearest transcript ends, allowed us to identify two classes of T3Es: strong, unregulated, hairpin-containing T3Es and weak, heat shock-regulated, hairpinless T3Es. Comparing gene expression levels under different conditions revealed widespread and divergent transcription regulation in M. gallisepticum. Modeling suggested that the core promoter structure plays an important role in the regulation of gene expression. We have shown that heat stress activation of cryptic promoters, combined with suppression of the hairpinless T3Es, leads to widespread, seemingly non-functional transcription.
Subjects
Gene Expression Regulation, Bacterial; Mycoplasma gallisepticum/genetics; Transcription, Genetic; Bacterial Proteins/chemistry; Gene Expression Profiling; Genome, Bacterial; Hot Temperature; Mycoplasma gallisepticum/metabolism; Promoter Regions, Genetic; RNA, Antisense/biosynthesis; RNA, Bacterial/chemistry; RNA, Bacterial/metabolism; Ribosomes/metabolism; Stress, Physiological/genetics; Transcription Initiation Site; Transformation, Bacterial
ABSTRACT
Ovarian cancer ascites is a native medium for cancer cells that allows investigation of their secretome in a natural environment. This medium is of interest as a promising source of potential biomarkers, and also as a medium for cell-cell communication. The aim of this study was to elucidate specific features of the malignant ascites metabolome and proteome. In order to omit components of the systemic response to ascites formation, we compared malignant ascites with cirrhosis ascites. Metabolome analysis revealed 41 components that differed significantly between malignant and cirrhosis ascites. Most of the identified cancer-specific metabolites are known to be important signaling molecules. Proteomic analysis identified 2096 and 1855 proteins in the ovarian cancer and cirrhosis ascites, respectively; 424 proteins were specific for the malignant ascites. Functional analysis of the proteome demonstrated that the major differences between cirrhosis and malignant ascites were observed for the cluster of spliceosomal proteins. Additionally, we demonstrate that several splicing RNAs were exclusively detected in malignant ascites, where they probably existed within protein complexes. This result was confirmed in vitro using an ovarian cancer cell line. Identification of spliceosomal proteins and RNAs in an extracellular medium is of particular interest; the finding suggests that they might play a role in the communication between cancer cells. In addition, malignant ascites contains a high number of exosomes that are known to play an important role in signal transduction. Thus our study reveals the specific features of malignant ascites that are associated with its function as a medium of intercellular communication.
Subjects
Ascites/genetics; Gene Expression Regulation, Neoplastic; Metabolome/genetics; Neoplasm Proteins/genetics; Ovarian Neoplasms/genetics; Proteome/genetics; RNA, Neoplasm/genetics; Alternative Splicing; Ascites/metabolism; Ascites/pathology; Cell Communication; Cell Line, Tumor; Exosomes/chemistry; Exosomes/metabolism; Female; Fibrosis/genetics; Fibrosis/metabolism; Fibrosis/pathology; Humans; Neoplasm Proteins/metabolism; Ovarian Neoplasms/metabolism; Ovarian Neoplasms/pathology; Proteome/metabolism; RNA, Neoplasm/metabolism; Signal Transduction; Spliceosomes/chemistry; Spliceosomes/metabolism; Transport Vesicles/chemistry; Transport Vesicles/metabolism
ABSTRACT
Ribosomes contain a number of modifications in rRNA, the function of which is unclear. Here we show, using proteomic analysis and dual fluorescence reporter in vivo assays, that m(2)G966 and m(5)C967 in 16S rRNA of Escherichia coli ribosomes are necessary for correct attenuation of the tryptophan (trp) operon. Expression of the trp operon is upregulated in the strain in which the RsmD and RsmB methyltransferases were deleted, which results in the lack of the m(2)G966 and m(5)C967 modifications. The upregulation requires the trpL attenuator, but is independent of the promoter of the trp operon, the ribosome binding site of the trpE gene, which follows the trp attenuator, and even the Trp codons in the trpL sequence. Suboptimal translation initiation efficiency in the rsmB/rsmD knockout strain is likely to cause a delay in translation relative to transcription, which causes misregulation of attenuation control of the trp operon.
Subjects
Escherichia coli/genetics; Nucleotides/genetics; Operon/genetics; RNA, Ribosomal, 16S/genetics; Tryptophan/genetics; Binding Sites/genetics; Codon/genetics; Gene Expression Regulation, Bacterial/genetics; Methyltransferases/genetics; Promoter Regions, Genetic/genetics; Protein Biosynthesis/genetics; Proteomics/methods; Ribosomes/genetics; Transcription, Genetic/genetics; Up-Regulation/genetics
ABSTRACT
MALINA is a web service for bioinformatic analysis of whole-genome metagenomic data obtained from human gut microbiota sequencing. As input, it accepts metagenomic reads from various sequencing technologies, including long reads (such as Sanger and 454 sequencing) and next-generation reads (including SOLiD and Illumina). To the authors' knowledge, it is the first metagenomic web service capable of processing SOLiD color-space reads. The web service allows phylogenetic and functional profiling of metagenomic samples using the coverage depth resulting from alignment of the reads to a catalogue of reference sequences, which is built into the pipeline and contains prevalent microbial genomes and genes of the human gut microbiota. The resulting metagenomic composition vectors are processed by a statistical analysis and visualization module containing methods for clustering, dimension reduction and group comparison. Additionally, the MALINA database includes vectors of bacterial and functional composition for human gut microbiota samples from a large number of existing studies, namely datasets from the Russian Metagenome project, MetaHIT and the Human Microbiome Project (downloaded from http://hmpdacc.org), allowing their comparative analysis together with user samples. MALINA is freely available on the web at http://malina.metagenome.ru. The website is implemented in JavaScript (using Ext JS), Microsoft .NET Framework, MS SQL and Python, with all major browsers supported.