Results 1 - 20 of 30
1.
Environ Res ; 219: 115130, 2023 02 15.
Article in English | MEDLINE | ID: mdl-36563976

ABSTRACT

Coastal seagrass meadows are essential for blue carbon storage and aquatic ecosystem services. However, this ecosystem has suffered severe eutrophication and destruction due to the expansion of aquaculture, and methods for promoting seagrass growth are still being explored. Here, data from 49 public coastal surveys on the distribution of seagrass and seaweed around onshore aquaculture facilities were revalidated, and an exceptional area where the seagrass Zostera marina thrives was found near the shore downstream of an onshore aquaculture facility. To characterize the sediment in which the seagrass grows, its physicochemical properties and bacterial ecology were evaluated. Evaluation of chemical properties in seagrass sediments confirmed a significant increase in total carbon and a decrease in zinc content. Association analysis and linear discriminant analysis narrowed down the bacterial candidates characteristic of seagrass-overgrown and non-overgrown sediment. Energy landscape analysis indicated that the symbiotic bacterial groups of seagrass sediment were strongly affected by proximity to the aquaculture facility where the seagrass grows, even though the bacterial populations appeared to fluctuate seasonally. The bacterial population there showed an apparent decrease in pathogen candidates belonging to the order Flavobacteriales. Moreover, structural equation modeling and a linear non-Gaussian acyclic model (LiNGAM), based on the machine learning data, estimated an optimal sediment symbiotic bacterial group candidate for seagrass growth as follows: the Lachnospiraceae and Ruminococcaceae families as gut-inhabitant bacteria, Rhodobacteraceae as photosynthetic bacteria, and Desulfobulbaceae as cable bacteria modulating oxygen or nitrate reduction and oxidation of sulfide. These observations offer a novel perspective on the sediment symbiotic bacterial structures critical for blue carbon and low-pathogenic marine ecosystems in aquaculture.


Subject(s)
Ecosystem, Zosteraceae, Humans, Geologic Sediments/analysis, Aquaculture, Carbon/analysis, Bacteria
2.
Int J Mol Sci ; 21(8), 2020 Apr 23.
Article in English | MEDLINE | ID: mdl-32340198

ABSTRACT

Nuclear magnetic resonance (NMR) spectroscopy is commonly used to characterize molecular complexity because it produces informative atomic-resolution data on the chemical structure and molecular mobility of samples non-invasively by means of various acquisition parameters and pulse programs. However, analyzing the accumulated NMR data of mixtures is challenging due to noise and signal overlap. Therefore, data-cleansing steps, such as quality checking, noise reduction, and signal deconvolution, are important processes before spectrum analysis. Here, we have developed an NMR measurement informatics tool for data cleansing that combines short-time Fourier transform (STFT; a time-frequency analytical method) and probabilistic sparse matrix factorization (PSMF) for signal deconvolution and noise factor analysis. Our tool can be applied to the original free induction decay (FID) signals of a one-dimensional NMR spectrum. We show that the signal deconvolution method reduces the noise of FID signals, increasing the signal-to-noise ratio (SNR) about tenfold, and its application to diffusion-edited spectra allows signals of macromolecules and unsuppressed small molecules to be separated by the length of the T2* relaxation time. Noise factor analysis of NMR datasets identified correlations between SNR and acquisition parameters, identifying major experimental factors that can lower SNR.


Subject(s)
Magnetic Resonance Spectroscopy/methods, Magnetic Resonance Spectroscopy/standards, Algorithms, Factor Analysis, Theoretical Models, Signal-to-Noise Ratio
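
The short-time Fourier transform step named in this abstract can be illustrated with a minimal sketch. It assumes a synthetic, noise-free FID produced by a hypothetical helper (`synthetic_fid`) and uses a plain Hann-windowed DFT; it is not the authors' tool and does not include the PSMF deconvolution. The window size, hop, and decay constant are illustrative values only.

```python
import cmath
import math


def stft(signal, window_size, hop):
    """Return a list of frames; each frame is a list of DFT magnitudes."""
    # A Hann window tapers each segment to limit spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (window_size - 1))
              for n in range(window_size)]
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(window_size)]
        # Direct DFT (O(N^2)), kept for clarity; only non-negative
        # frequency bins are returned.
        spectrum = []
        for k in range(window_size // 2 + 1):
            acc = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                      for n in range(window_size))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def synthetic_fid(n_points, freq_bin, window_size, decay):
    """Hypothetical real-valued FID: a single decaying cosine, no noise."""
    return [math.exp(-t / decay)
            * math.cos(2 * math.pi * freq_bin * t / window_size)
            for t in range(n_points)]


fid = synthetic_fid(n_points=256, freq_bin=8, window_size=64, decay=80.0)
frames = stft(fid, window_size=64, hop=32)

# Strongest frequency bin and its magnitude in each time frame.
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in frames]
peak_heights = [max(f) for f in frames]
```

In this toy case the peak stays in the same frequency bin while its height falls from frame to frame, which is the kind of time-resolved decay information that lets components with different relaxation times be told apart.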
3.
Plant J ; 90(3): 587-605, 2017 May.
Article in English | MEDLINE | ID: mdl-28214361

ABSTRACT

Information about transcription start sites (TSSs) provides baseline data for the analysis of promoter architecture. In this paper we used paired- and single-end deep sequencing to analyze Arabidopsis TSS tags from several libraries prepared from roots, shoots, flowers and etiolated seedlings. The clustering of approximately 33 million mapped TSS tags led to the identification of 324 461 promoters that covered 79.7% (21 672/27 206) of protein-coding genes in the Arabidopsis genome. In addition we identified intragenic, antisense and orphan promoters that were not associated with any gene models. Of these, intragenic promoters exhibited unique characteristics regarding dinucleotide sequences at TSSs and core promoter element composition, suggesting that these promoters use different mechanisms of transcriptional initiation. An analysis of base composition with regard to promoter position revealed a low GC content throughout the promoter region and several local strand biases that were evident for TATA-type promoters, but not for Coreless-type promoters. Most observed strand biases coincided with strand biases of single nucleotide polymorphism rate. Our analysis also revealed that transcription of a gene is supported by an average of 2.7 genic promoters, among which one specific promoter, designated as a top promoter, substantially determines the expression level of the gene.


Subject(s)
Arabidopsis/genetics, Genetic Promoter Regions/genetics, Transcription Initiation Site/physiology, Arabidopsis Proteins/genetics, Plant Gene Expression Regulation/genetics, Plant Gene Expression Regulation/physiology
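
The idea of clustering mapped TSS tags into promoters can be sketched as follows. The abstract does not describe the clustering algorithm, so this is a generic single-linkage grouping on one strand of one chromosome; the `max_gap` parameter, the function names, and the tag positions are all hypothetical.

```python
def cluster_tss(positions, max_gap):
    """Group tag positions so that neighbours within max_gap share a cluster."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)   # extend the current cluster
        else:
            clusters.append([pos])     # start a new cluster
    return clusters


def summarize(cluster):
    """Report the span, tag count, and most frequent position of a cluster."""
    counts = {}
    for pos in cluster:
        counts[pos] = counts.get(pos, 0) + 1
    # Ties are resolved in favour of the smallest coordinate.
    peak = max(sorted(counts), key=counts.get)
    return {"start": cluster[0], "end": cluster[-1],
            "tags": len(cluster), "peak": peak}


# Hypothetical tag coordinates (one entry per mapped tag).
tags = [100, 101, 101, 103, 250, 252, 252, 252, 900]
promoters = [summarize(c) for c in cluster_tss(tags, max_gap=20)]
```

Each summary corresponds to one candidate promoter, with `peak` marking the dominant start site inside it.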
4.
Plant Cell Physiol ; 58(1): e6, 2017 01 01.
Article in English | MEDLINE | ID: mdl-28069893

ABSTRACT

Algae are smaller organisms than land plants and offer clear advantages in research over terrestrial species in terms of rapid production, short generation time and varied commercial applications. Thus, studies investigating the practical development of effective algal production are important and will improve our understanding of both aquatic and terrestrial plants. In this study we estimated multiple physicochemical and secondary structural properties of protein sequences, the predicted presence of post-translational modification (PTM) sites, and subcellular localization using a total of 510,123 protein sequences from the proteomes of 31 algal and three plant species. Algal species were broadly selected from green and red algae, glaucophytes, oomycetes, diatoms and other microalgal groups. The results were deposited in the Algal Protein Annotation Suite database (Alga-PrAS; http://alga-pras.riken.jp/), which can be freely accessed online.


Subject(s)
Algal Proteins/metabolism, Protein Databases, Microalgae/metabolism, Proteome/metabolism, Algal Proteins/classification, Chlorophyta/classification, Chlorophyta/metabolism, Cluster Analysis, Computational Biology/methods, Cyanophora/metabolism, Diatoms/classification, Diatoms/metabolism, Internet, Microalgae/classification, Oomycetes/classification, Oomycetes/metabolism, Plant Proteins/classification, Plant Proteins/metabolism, Plants/classification, Plants/metabolism, Rhodophyta/classification, Rhodophyta/metabolism
5.
J Plant Res ; 129(4): 711-726, 2016 Jul.
Article in English | MEDLINE | ID: mdl-27138000

ABSTRACT

Cassava anthracnose disease (CAD), caused by the fungus Colletotrichum gloeosporioides f. sp. manihotis, is a serious disease of cassava (Manihot esculenta) worldwide. In this study, we established a cassava oligonucleotide-DNA microarray representing 59,079 probes corresponding to approximately 30,000 genes based on original expressed sequence tags and RNA-seq information from cassava, and applied it to investigate the molecular mechanisms of resistance to fungal infection using two cassava cultivars, Huay Bong 60 (HB60, resistant to CAD) and Hanatee (HN, sensitive to CAD). Based on quantitative real-time reverse transcription PCR and expression profiling by the microarray, we showed that the expression levels of various plant defense-related genes, such as pathogenesis-related (PR) genes, cell wall-related genes, detoxification enzyme genes, genes related to the response to bacteria, mitogen-activated protein kinase (MAPK) genes, and genes related to the salicylic acid, jasmonic acid and ethylene pathways, were higher in HB60 than in HN. Our results indicated that the induction of PR genes in HB60 by fungal infection and the higher expression of defense response-related genes in HB60 compared with HN are likely responsible for the fungal resistance in HB60. We also showed that the use of our cassava oligo microarray could improve our understanding of cassava molecular mechanisms related to environmental responses and development, and advance the molecular breeding of useful cassava plants.


Subject(s)
Colletotrichum/physiology, Gene Expression Profiling/methods, Plant Gene Expression Regulation, Manihot/genetics, Manihot/microbiology, Oligonucleotide Array Sequence Analysis/methods, Plant Diseases/genetics, Plant Diseases/microbiology, Cyclopentanes/metabolism, Ethylenes/metabolism, Gene Ontology, Plant Genes, Oxylipins/metabolism, Real-Time Polymerase Chain Reaction, Reproducibility of Results, Salicylic Acid/metabolism, Signal Transduction/genetics, Up-Regulation/genetics
6.
Plant Cell Physiol ; 56(1): e11, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25435546

ABSTRACT

Arabidopsis thaliana is an important model species for studies of plant gene functions. Research on Arabidopsis has resulted in the generation of high-quality genome sequences, annotations and related post-genomic studies. The amount of annotation, such as gene-coding regions and structures, is steadily growing in the field of plant research. In contrast to the genomics resource of animals and microorganisms, there are still some difficulties with characterization of some gene functions in plant genomics studies. The acquisition of information on protein structure can help elucidate the corresponding gene function because proteins encoded in the genome possess highly specific structures and functions. In this study, we calculated multiple physicochemical and secondary structural parameters of protein sequences, including length, hydrophobicity, the amount of secondary structure, the number of intrinsically disordered regions (IDRs) and the predicted presence of transmembrane helices and signal peptides, using a total of 208,333 protein sequences from the genomes of six representative plant species, Arabidopsis thaliana, Glycine max (soybean), Populus trichocarpa (poplar), Oryza sativa (rice), Physcomitrella patens (moss) and Cyanidioschyzon merolae (alga). Using the PASS tool and the Rosetta Stone method, we annotated the presence of novel functional regions in 1,732 protein sequences that included unannotated sequences from the Arabidopsis and rice proteomes. These results were organized into the Plant Protein Annotation Suite database (Plant-PrAS), which can be freely accessed online at http://plant-pras.riken.jp/.


Subject(s)
Protein Databases, Information Storage and Retrieval, Plant Proteins/chemistry, Plants/metabolism, Proteome, Arabidopsis/genetics, Arabidopsis/metabolism, Bryopsida/genetics, Bryopsida/metabolism, Chromosome Mapping, Internet, Molecular Sequence Annotation, Open Reading Frames, Oryza/genetics, Oryza/metabolism, Plant Proteins/genetics, Plant Proteins/metabolism, Plants/genetics, Populus/genetics, Populus/metabolism, Rhodophyta/genetics, Rhodophyta/metabolism
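
Two of the per-sequence parameters listed in this abstract, length and hydrophobicity, can be computed with a few lines of code. The sketch below assumes the Kyte-Doolittle hydropathy scale and reports the grand average of hydropathy (GRAVY); the abstract does not say which scale Plant-PrAS uses, and the example sequence is made up.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean hydropathy per residue; positive values indicate hydrophobicity."""
    if not sequence:
        raise ValueError("empty sequence")
    unknown = set(sequence) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def describe(sequence):
    """Collect simple sequence-derived parameters in one record."""
    return {"length": len(sequence), "gravy": gravy(sequence)}


# Hypothetical peptide used only to show the call.
record = describe("MKTAYIAKQR")
```

Running `describe` over every sequence in a proteome yields the kind of per-protein table that such an annotation database stores.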
7.
Bioinformatics ; 30(8): 1095-1103, 2014 04 15.
Article in English | MEDLINE | ID: mdl-24403539

ABSTRACT

MOTIVATION: Protein structural research in plants lags behind that in animal and bacterial species. This lag concerns both the structural analysis of individual proteins and the proteome-wide characterization of structure-related properties. Until now, no systematic study concerning the relationships between protein disorder and multiple post-translational modifications (PTMs) in plants has been presented. RESULTS: In this work, we calculated the global degree of intrinsic disorder in the complete proteomes of eight typical monocotyledonous and dicotyledonous plant species. We further predicted multiple sites for phosphorylation, glycosylation, acetylation and methylation and examined the correlations of protein disorder with the presence of the predicted PTM sites. It was found that phosphorylation, acetylation and O-glycosylation displayed a clear preference for occurrence in disordered regions of plant proteins. In contrast, methylation tended to avoid disordered sequence, whereas N-glycosylation did not show a universal structural preference in monocotyledonous and dicotyledonous plants. In addition, the analysis performed revealed significant differences between the integral characteristics of monocot and dicot proteomes. They included elevated disorder degree, increased rate of O-glycosylation and R-methylation, decreased rate of N-glycosylation, K-acetylation and K-methylation in monocotyledonous plant species, as compared with dicotyledonous species. Altogether, our study provides the most compelling evidence so far for the connection between protein disorder and multiple PTMs in plants. CONTACT: tokmak@phoenix.kobe-u.ac.jp or tetsuya.sakurai@riken.jp Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Plant Proteins/chemistry, Plants/chemistry, Post-Translational Protein Processing, Acetylation, Glycosylation, Methylation, Phosphorylation, Proteome/chemistry
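
A preference of PTM sites for disordered regions, as reported in this abstract, can be expressed as a simple enrichment ratio. The sketch below assumes that a per-residue disorder prediction and a list of predicted PTM positions are already available; the function name and the example data are hypothetical, and the study's actual statistics are not reproduced here.

```python
def disorder_enrichment(disorder_mask, ptm_sites):
    """Ratio of (fraction of PTM sites in disorder) to (fraction of
    residues that are disordered). Values above 1 suggest a preference
    for disordered regions; values below 1 suggest avoidance."""
    if not ptm_sites:
        raise ValueError("no PTM sites given")
    disordered_fraction = sum(disorder_mask) / len(disorder_mask)
    if disordered_fraction == 0:
        raise ValueError("protein has no disordered residues")
    in_disorder = sum(1 for i in ptm_sites if disorder_mask[i])
    return (in_disorder / len(ptm_sites)) / disordered_fraction


# Hypothetical 10-residue protein: the first four residues are disordered.
mask = [True, True, True, True, False, False, False, False, False, False]
# Hypothetical predicted phosphorylation sites (0-based positions).
sites = [0, 1, 2, 7]
ratio = disorder_enrichment(mask, sites)
```

Here three of four sites fall in the 40% of the sequence that is disordered, so the ratio is well above 1.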
8.
Int J Mol Sci ; 16(8): 19812-35, 2015 Aug 20.
Article in English | MEDLINE | ID: mdl-26307970

ABSTRACT

Recent proteome analyses have reported that intrinsically disordered regions (IDRs) of proteins play important roles in biological processes. In higher plants whose genomes have been sequenced, the correlation between IDRs and post-translational modifications (PTMs) has been reported. The genomes of various eukaryotic algae as common ancestors of plants have also been sequenced. However, no analysis of the relationship to protein properties such as structure and PTMs in algae has been reported. Here, we describe correlations between IDR content and the number of PTM sites for phosphorylation, glycosylation, and ubiquitination, and between IDR content and regions rich in proline, glutamic acid, serine, and threonine (PEST) and transmembrane helices in the sequences of 20 algae proteomes. Phosphorylation, O-glycosylation, ubiquitination, and PEST preferentially occurred in disordered regions. In contrast, transmembrane helices were favored in ordered regions. N-glycosylation tended to occur in ordered regions in most of the studied algae; however, it correlated positively with disordered protein content in diatoms. Additionally, we observed that disordered protein content and the number of PTM sites were significantly increased in the species-specific protein clusters compared to common protein clusters among the algae. Moreover, there were specific relationships between IDRs and PTMs among the algae from different groups.


Subject(s)
Algal Proteins/metabolism, Computational Biology/methods, Intrinsically Disordered Proteins/metabolism, Post-Translational Protein Processing, Algal Proteins/chemistry, Chlorophyta/metabolism, Computer Simulation, Diatoms/metabolism, Intrinsically Disordered Proteins/chemistry, Oomycetes/metabolism, Protein Conformation, Rhodophyta/metabolism, Species Specificity
9.
Plant Cell Physiol ; 55(1): e4, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24272250

ABSTRACT

Arabidopsis thaliana is one of the most popular experimental plants. However, only 40% of its genes have at least one experimental Gene Ontology (GO) annotation assigned. Systematic observation of mutant phenotypes is an important technique for elucidating gene functions. Indeed, several large-scale phenotypic analyses have been performed and have generated phenotypic data sets from many Arabidopsis mutant lines and overexpressing lines, which are freely available online. Since each Arabidopsis mutant line database uses individual phenotype expression, the differences in the structured term sets used by each database make it difficult to compare data sets and make it impossible to search across databases. Therefore, we obtained publicly available information for a total of 66,209 Arabidopsis mutant lines, including loss-of-function (RATM and TARAPPER) and gain-of-function (AtFOX and OsFOX) lines, and integrated the phenotype data by mapping the descriptions onto Plant Ontology (PO) and Phenotypic Quality Ontology (PATO) terms. This approach made it possible to manage the four different phenotype databases as one large data set. Here, we report a publicly accessible web-based database, the RIKEN Arabidopsis Genome Encyclopedia II (RARGE II; http://rarge-v2.psc.riken.jp/), in which all of the data described in this study are included. Using the database, we demonstrated consistency (in terms of protein function) with a previous study and identified the presumed function of an unknown gene. We provide examples of AT1G21600, which is a subunit in the plastid-encoded RNA polymerase complex, and AT5G56980, which is related to the jasmonic acid signaling pathway.


Subject(s)
Arabidopsis/anatomy & histology, Arabidopsis/genetics, Genetic Databases, Mutation/genetics, Heritable Quantitative Trait, Controlled Vocabulary, Gene Ontology, Plant Genes, Internet, Molecular Sequence Annotation, Phenotype, User-Computer Interface
10.
MethodsX ; 12: 102528, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38274701

ABSTRACT

Data science methods are increasingly needed in environmental fields that handle marine, weather, and soil data. Such datasets are sometimes large, but they are often small because the available observations are limited. In such cases, the data are typically evaluated statistically by differential analysis of increases or decreases in factor levels, and the essential factors are estimated from the result. However, there is no consistent approach for assessing strong associations among factors as a group. Causal inference methods can produce useful results for small data, and those results are expected to provide important information for understanding potentially strong associations between factors without necessarily relying on big data. Here, we describe essential checkpoints and settings for calculations with a direct method for learning a linear non-Gaussian structural equation model (DirectLiNGAM), together with methods for validating the calculation results using small-scale model data, as an additional discussion of the DirectLiNGAM portion of the related research article. Thus, this study provides statistical validation methods for association networks, treatments, and interventions in the structural inference of groups of essential factors.
• Causal inference with DirectLiNGAM
• Validation of correlation coefficient and feature importance
• Validation using causal effect object and propensity scores
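
The principle behind linear non-Gaussian causal discovery can be shown with a two-variable toy example. In the correct causal direction the regression residual is independent of the regressor, while in the reverse direction it is not. This sketch is not the DirectLiNGAM algorithm and does not use the `lingam` package: the dependence measure (absolute correlation between squared values) is a crude stand-in chosen for brevity, and the simulated data, function names, and coefficients are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def residual(target, regressor):
    """Residual of an ordinary least-squares fit of target on regressor."""
    mt, mr = mean(target), mean(regressor)
    slope = (sum((r - mr) * (t - mt) for r, t in zip(regressor, target))
             / sum((r - mr) ** 2 for r in regressor))
    return [(t - mt) - slope * (r - mr) for r, t in zip(regressor, target)]


def dependence(regressor, resid):
    """Crude dependence proxy: |corr| between squared centred values."""
    mr = mean(regressor)
    return abs(pearson([(r - mr) ** 2 for r in regressor],
                       [e ** 2 for e in resid]))


def infer_direction(x, y):
    """Pick the direction whose residual looks more independent."""
    dep_xy = dependence(x, residual(y, x))  # candidate model: x -> y
    dep_yx = dependence(y, residual(x, y))  # candidate model: y -> x
    return "x->y" if dep_xy < dep_yx else "y->x"


# Simulated data: x causes y, with uniform (non-Gaussian) noise.
rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(2000)]
y = [2.0 * xi + rng.uniform(-1, 1) for xi in x]
direction = infer_direction(x, y)
```

With Gaussian noise both directions would fit equally well, which is why the non-Gaussianity assumption matters for this family of methods.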

11.
J Biol Chem ; 287(32): 27106-16, 2012 Aug 03.
Article in English | MEDLINE | ID: mdl-22674579

ABSTRACT

Post-translational modifications (PTMs) are required for proper folding of many proteins. The low capacity for PTMs hinders the production of heterologous proteins in the widely used prokaryotic systems of protein synthesis. Until now, a systematic and comprehensive study concerning the specific effects of individual PTMs on heterologous protein synthesis has not been presented. To address this issue, we expressed 1488 human proteins and their domains in a bacterial cell-free system, and we examined the correlation of the expression yields with the presence of multiple PTM sites bioinformatically predicted in these proteins. This approach revealed a number of previously unknown statistically significant correlations. Prediction of some PTMs, such as myristoylation, glycosylation, palmitoylation, and disulfide bond formation, was found to significantly worsen protein amenability to soluble expression. The presence of other PTMs, such as aspartyl hydroxylation, C-terminal amidation, and Tyr sulfation, did not correlate with the yield of heterologous protein expression. Surprisingly, the predicted presence of several PTMs, such as phosphorylation, ubiquitination, SUMOylation, and prenylation, was associated with the increased production of properly folded soluble proteins. The plausible rationales for the existence of the observed correlations are presented. Our findings suggest that identification of potential PTMs in polypeptide sequences can be of practical use for predicting expression success and optimizing heterologous protein synthesis. In sum, this study provides the most compelling evidence so far for the role of multiple PTMs in the stability and solubility of heterologously expressed recombinant proteins.


Subject(s)
Protein Biosynthesis, Post-Translational Protein Processing, Phosphorylation
12.
Sci Rep ; 13(1): 6359, 2023 04 19.
Article in English | MEDLINE | ID: mdl-37076584

ABSTRACT

Reducing antibiotic usage among livestock animals to prevent antimicrobial resistance has become an urgent issue worldwide. This study evaluated the effects of administering chlortetracycline (CTC), a versatile antibacterial agent, on the performance, blood components, fecal microbiota, and organic acid concentrations of calves. Japanese Black calves were fed with milk replacers containing CTC at 10 g/kg (CON group) or 0 g/kg (EXP group). Growth performance was not affected by CTC administration. However, CTC administration altered the correlation between fecal organic acids and bacterial genera. Machine learning (ML) methods such as association analysis, linear discriminant analysis, and energy landscape analysis revealed that CTC administration affected populations of various types of fecal bacteria. Interestingly, the abundance of several methane-producing bacteria at 60 days of age was high in the CON group, and the abundance of Lachnospiraceae, a butyrate-producing bacterium, was high in the EXP group. Furthermore, statistical causal inference based on ML data estimated that CTC treatment affected the entire intestinal environment, potentially suppressing butyrate production, which may be attributed to methanogens in feces. Thus, these observations highlight the multiple harmful impacts of antibiotics on the intestinal health of calves and the potential production of greenhouse gases by calves.


Subject(s)
Anti-Bacterial Agents, Chlortetracycline, Animals, Cattle, Anti-Bacterial Agents/pharmacology, Dysbiosis, Chlortetracycline/pharmacology, Feces/microbiology, Bacteria, Butyrates, Animal Feed/analysis, Diet/veterinary
13.
ISME Commun ; 3(1): 28, 2023 Mar 31.
Article in English | MEDLINE | ID: mdl-37002405

ABSTRACT

Compost is used worldwide as a soil conditioner for crops, but its functions are still being explored. Here, the omics profiles of carrots, as a model root vegetable, were investigated for growth and quality indices in a field amended with compost fermented with thermophilic Bacillaceae. Exposure to compost significantly increased the productivity, antioxidant activity, color, and taste of the carrot root and altered the soil bacterial composition along with the levels of characteristic metabolites in the leaf, root, and soil. Based on the data, structural equation modeling (SEM) estimated that amino acids, antioxidant activity, flavonoids and/or carotenoids in plants were optimally linked by exposure to compost. The SEM of the soil estimated that the genus Paenibacillus and nitrogen compounds were optimally involved during exposure. These estimates did not contradict the whole-genome analysis of compost-derived Paenibacillus isolates or the bioactivity data, suggesting the presence of a complex cascade of plant growth-promoting effects and modulation of the nitrogen cycle by the compost itself. These observations provide information on the qualitative indicators of compost in complex soil-plant interactions and offer a new perspective for chemically independent sustainable agriculture through the efficient use of natural nitrogen.

14.
Sci Rep ; 12(1): 10558, 2022 06 22.
Article in English | MEDLINE | ID: mdl-35732681

ABSTRACT

In the development of polymer materials, exploring the complex relationships between domain structure and physical properties is an important issue. In the domain structure analysis of polymer materials, 1H-static solid-state NMR (ssNMR) spectra can provide information on mobile, rigid, and intermediate domains. However, estimating domain structure from such spectra is difficult because the spectra from multiple domains overlap widely. Therefore, we have developed a materials informatics approach that combines domain modeling (http://dmar.riken.jp/matrigica/) with the integrated analysis of meta-information (the elements, functional groups, additives, and physical properties) in polymer materials. First, the 1H-static ssNMR data of 120 polymer materials were subjected to a short-time Fourier transform to obtain frequency, intensity, and T2 relaxation time for domains with different mobility. The average T2 relaxation time of each domain is 0.96 ms for Mobile, 0.55 ms for Intermediate (Mobile), 0.32 ms for Intermediate (Rigid), and 0.11 ms for Rigid. Second, the estimated domain proportions were integrated with meta-information such as elements, functional groups, and thermophysical properties and were analyzed using a self-organization map and market basket analysis. The proposed method can contribute to exploring structure-property relationships of polymer materials with multiple domains.


Subject(s)
Magnetic Resonance Imaging, Polymers, Informatics, Magnetic Resonance Spectroscopy/methods, Polymers/chemistry
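
Market basket analysis, mentioned in this abstract, scores co-occurrence rules with support, confidence, and lift. The sketch below computes those three quantities for one rule. The item labels (`ester`, `rigid_rich`, and so on) and the five "materials" are invented for illustration and do not come from the study.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)        # contain antecedent
    n_c = sum(1 for t in transactions if c <= t)        # contain consequent
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # contain both
    if n_a == 0 or n_c == 0:
        raise ValueError("antecedent or consequent never occurs")
    support = n_ac / n
    confidence = n_ac / n_a
    lift = confidence / (n_c / n)
    return support, confidence, lift


# Hypothetical materials, each described by a set of categorical labels.
materials = [
    {"ester", "rigid_rich"},
    {"ester", "rigid_rich", "additive"},
    {"ester", "mobile_rich"},
    {"ether", "mobile_rich"},
    {"ether", "rigid_rich"},
]
support, confidence, lift = rule_metrics(materials, {"ester"}, {"rigid_rich"})
```

A lift above 1 means the two labels co-occur more often than expected if they were independent; in practice continuous values such as domain proportions would first be binned into labels like these.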
15.
Sci Total Environ ; 836: 155520, 2022 Aug 25.
Article in English | MEDLINE | ID: mdl-35508250

ABSTRACT

Effective biological utilization of wood biomass is necessary worldwide. Since several insect larvae can use wood biomass as a nutrient source, studies on their digestive microbial structures are expected to reveal novel rules underlying wood biomass processing. Here, structural inferences for inhabitant bacteria involved in carbon and nitrogen metabolism were performed for beetle larvae, as an insect model, to explore the potential rules. Bacterial analysis of larval feces showed enrichment of the phyla Chloroflexi, Gemmatimonadetes, and Planctomycetes, and the genera Bradyrhizobium, Cohnella, Corallococcus, Gemmata, Hyphomicrobium, Lutibacterium, Paenibacillus, and Rhodoplanes, as bacteria potentially involved in plant growth promotion, nitrogen cycle modulation, and/or environmental protection. The fecal abundances of these bacteria were not necessarily positively correlated with their abundances in the habitat, indicating that they were selectively enriched in the feces of the larvae. Correlation and association analyses predicted that common fecal bacteria might affect carbon and nitrogen metabolism. Based on these hypotheses, structural equation modeling (SEM) statistically estimated that inhabitant bacterial groups involved in carbon and nitrogen metabolism were composed of the phyla Gemmatimonadetes and Planctomycetes, and the genera Bradyrhizobium, Corallococcus, Gemmata, and Paenibacillus, which were among the fecal-enriched bacteria. Nevertheless, the selected common bacteria, i.e., the phyla Acidobacteria, Armatimonadetes, and Bacteroidetes and the genera Candidatus Solibacter, Devosia, Fimbriimonas, Gemmatimonas, Opitutus, Sphingobium, and Methanobacterium, were necessary to obtain good fit indices in the SEM. In addition, the composition of the bacterial groups differed depending upon the metabolic targets, carbon and nitrogen, and their stable isotopes, δ13C and δ15N, respectively. Thus, the statistically derived causal structural models highlighted that the larval fecal-enriched bacteria and common symbiotic bacteria might selectively play a role in wood biomass carbon and nitrogen metabolism. This information could offer a new perspective that helps us use wood biomass more efficiently and might stimulate innovation in environmental industries in the future.


Subject(s)
Carbon, Coleoptera, Acidobacteria/metabolism, Animals, Bacteria/metabolism, Carbon/metabolism, Coleoptera/metabolism, Larva/metabolism, Nitrogen/metabolism, Wood/metabolism
16.
Plant Cell Physiol ; 52(2): 265-73, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21186176

ABSTRACT

Identification of gene function is important not only for basic research but also for applied science, especially with regard to improvements in crop production. For rapid and efficient elucidation of useful traits, we developed a system named FOX hunting (Full-length cDNA Over-eXpressor gene hunting) using full-length cDNAs (fl-cDNAs). A heterologous expression approach provides a solution for the high-throughput characterization of gene functions in agricultural plant species. Since fl-cDNAs contain all the information of functional mRNAs and proteins, we introduced rice fl-cDNAs into Arabidopsis plants for systematic gain-of-function mutation. We generated >30,000 independent Arabidopsis transgenic lines expressing rice fl-cDNAs (rice FOX Arabidopsis mutant lines). These rice FOX Arabidopsis lines were screened systematically for various criteria such as morphology, photosynthesis, UV resistance, element composition, plant hormone profile, metabolite profile/fingerprinting, bacterial resistance, and heat and salt tolerance. The information obtained from these screenings was compiled into a database named 'RiceFOX'. This database contains around 18,000 records of rice FOX Arabidopsis lines and allows users to search against all the observed results, ranging from morphological to invisible traits. The number of searchable items is approximately 100; moreover, the rice FOX Arabidopsis lines can be searched by rice and Arabidopsis gene/protein identifiers, sequence similarity to the introduced rice fl-cDNA and traits. The RiceFOX database is available at http://ricefox.psc.riken.jp/.


Subject(s)
Arabidopsis/genetics, Complementary DNA/genetics, Genetic Databases, Oryza/genetics, Arabidopsis/metabolism, Cluster Analysis, Plant DNA/genetics, Plant Genome, Internet, Genetically Modified Plants/genetics, Genetically Modified Plants/metabolism, DNA Sequence Analysis, User-Computer Interface
17.
FASEB J ; 24(4): 1095-104, 2010 Apr.
Article in English | MEDLINE | ID: mdl-19940260

ABSTRACT

High-throughput cell-free protein synthesis is being used increasingly in structural/functional genomics projects. However, the factors determining expression success are poorly understood. Here, we evaluated the expression of 3066 human proteins and their domains in a bacterial cell-free system and analyzed the correlation of protein expression with 39 physicochemical and structural properties of proteins. As a result of the bioinformatics analysis performed, we determined the 18 most influential features that affect protein amenability to cell-free expression. They include protein length; hydrophobicity; pI; content of charged, nonpolar, and aromatic residues; cysteine content; solvent accessibility; presence of coiled coil; content of intrinsically disordered and structured (alpha-helix and beta-sheet) sequence; number of disulfide bonds and functional domains; presence of transmembrane regions; PEST motifs; and signaling sequences. This study represents the first comprehensive bioinformatics analysis of heterologous protein synthesis in a cell-free system. The rules and correlations revealed here provide a plethora of important insights into rationalization of cell-free protein production and can be of practical use for protein engineering with the aim of increasing expression success.-Kurotani, A., Takagi, T., Toyama, M., Shirouzu, M., Yokoyama, S., Fukami, Y., Tokmakov, A. A. Comprehensive bioinformatics analysis of cell-free protein synthesis: identification of multiple protein properties that correlate with successful expression.


Subject(s)
Theoretical Models , Protein Biosynthesis/physiology , Recombinant Proteins/biosynthesis , Recombinant Proteins/chemistry , Amino Acid Motifs , Cell-Free System/chemistry , Cell-Free System/metabolism , Computational Biology/methods , Escherichia coli/chemistry , Escherichia coli/metabolism , Humans , Tertiary Protein Structure
18.
ACS Omega ; 6(22): 14278-14287, 2021 Jun 08.
Article in English | MEDLINE | ID: mdl-34124451

ABSTRACT

Materials informatics is an emerging field that allows us to predict the properties of materials and has been applied in various research and development fields, such as materials science. In particular, solubility factors such as the Hansen and Hildebrand solubility parameters (HSPs and SP, respectively) and Log P are important values for understanding the physical properties of various substances. In this study, we succeeded in establishing a solubility prediction tool using a unique machine learning method called the in-phase deep neural network (ip-DNN), which starts exclusively from analytical input data (e.g., NMR information, refractive index, and density) and predicts solubility in multiple steps by first predicting intermediate elements, such as molecular components and molecular descriptors. To improve the accuracy of the prediction, intermediate regression models were employed when performing in-phase machine learning. In addition, we developed a website dedicated to the established solubility prediction method, which is freely available at "http://dmar.riken.jp/matsolca/".

19.
Front Mol Biosci ; 8: 775736, 2021.
Article in English | MEDLINE | ID: mdl-34912847

ABSTRACT

The protein isoelectric point (pI) can be calculated from an amino acid sequence by computational analysis, in good agreement with experimental data. The availability of whole-genome sequences enables comparative studies of proteome-wide pI distributions. It was found that the whole-proteome distributions of protein pI values are multimodal in different species. It was further hypothesized that the observed multimodality is associated with subcellular localization-specific differences in local pI distributions. Here, we review the multimodality of proteome-wide pI distributions in different organisms, focusing on the relationships between protein pI and subcellular localization. We also discuss the probable factors responsible for variation of the intracellular localization-specific pI profiles.
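
As an illustration of how a pI can be computed from a sequence, the following is a minimal sketch and not the method used in the article. It assumes the Henderson-Hasselbalch charge model, and the pKa values are one commonly quoted textbook-style set chosen here for illustration (published pKa scales differ and give slightly different pI values). The function names `net_charge` and `isoelectric_point` are hypothetical.

```python
# Assumed pKa values for illustration only; other published scales exist.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}            # basic side chains
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}   # acidic side chains
PKA_N_TERMINUS = 9.0
PKA_C_TERMINUS = 2.0


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch model)."""
    # Fraction of protonated (positively charged) N-terminus.
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))
    # Fraction of deprotonated (negatively charged) C-terminus.
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))
    for residue in sequence.upper():
        if residue in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return positive - negative


def isoelectric_point(sequence: str, tol: float = 1e-4) -> float:
    """Find the pH at which the net charge is zero, by bisection on [0, 14]."""
    low, high = 0.0, 14.0
    # The net charge decreases monotonically with pH, so bisection converges.
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0
```

Applying `isoelectric_point` to every sequence in a proteome and plotting a histogram of the results would give the kind of proteome-wide pI distribution the abstract discusses; a lysine-rich peptide yields a higher value than an aspartate-rich one.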

20.
BMC Bioinformatics ; 11: 113, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-20193068

ABSTRACT

BACKGROUND: Efficient dissection of large proteins into their structural domains is critical for high-throughput proteome analysis. So far, no study has focused on mathematically modeling a protein dissection protocol in terms of a production system. Here, we report a mathematical model for empirically optimizing the cost of large-scale domain production in proteomics research. RESULTS: The model computes the expected number of successfully produced soluble domains, using a conditional probability between domain and boundary identification. Typical values for the model's parameters were estimated using the experimental results for identifying soluble domains from the 2,032 Kazusa HUGE protein sequences. Among the 215 fragments corresponding to the 24 domains that were expressed correctly, 111, corresponding to 18 domains, were soluble. Our model indicates that, under the conditions used in our pilot experiment, the probability of correctly predicting the existence of a domain was 81% (175/215) and that of predicting its boundary was 63% (111/175). Under these conditions, the most cost/effort-effective production of soluble domains was to prepare one to seven fragments per predicted domain. CONCLUSIONS: Our mathematical modeling of protein dissection protocols indicates that the optimum number of fragments tested per domain is actually much smaller than expected a priori. The application range of our model is not limited to protein dissection, and it can be utilized for designing various large-scale mutational analyses or screening libraries.
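
The percentages quoted in the abstract can be checked with a few lines of arithmetic. The sketch below only reproduces the two reported ratios and combines them with the chain rule; it is not the authors' cost model. The helper `p_at_least_one_soluble` is hypothetical and rests on an added assumption, namely that fragments tested for the same domain succeed independently, which the abstract does not state.

```python
# Counts quoted in the abstract.
FRAGMENTS_TOTAL = 215       # denominator of the 81% figure
FRAGMENTS_DOMAIN_OK = 175   # numerator of the 81% figure
FRAGMENTS_SOLUBLE = 111     # numerator of the 63% figure

# P(domain correctly predicted), reported as 81%.
p_domain = FRAGMENTS_DOMAIN_OK / FRAGMENTS_TOTAL
# P(boundary correctly predicted | domain correctly predicted), reported as 63%.
p_boundary_given_domain = FRAGMENTS_SOLUBLE / FRAGMENTS_DOMAIN_OK
# Chain rule: probability that a single fragment turns out soluble.
p_soluble_fragment = p_domain * p_boundary_given_domain


def p_at_least_one_soluble(n_fragments: int) -> float:
    """Chance that at least one of n fragments is soluble.

    Assumption (not from the abstract): fragments succeed independently.
    """
    return 1.0 - (1.0 - p_soluble_fragment) ** n_fragments


if __name__ == "__main__":
    print(f"P(domain)            = {p_domain:.2%}")
    print(f"P(boundary | domain) = {p_boundary_given_domain:.2%}")
    print(f"P(soluble fragment)  = {p_soluble_fragment:.2%}")
    for n in range(1, 8):
        print(n, f"{p_at_least_one_soluble(n):.3f}")
```

Under the independence assumption the chance of at least one success rises quickly with the first few fragments and then flattens, which is consistent in spirit with the abstract's conclusion that only a small number of fragments per domain is worth preparing.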


Subject(s)
Theoretical Models , Proteins/chemistry , Proteomics/methods , Protein Databases , Tertiary Protein Structure , Proteome/chemistry