Results 1 - 20 of 118
1.
Biotechnol J ; 19(1): e2300289, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38015079

ABSTRACT

Raman spectroscopy is widely used in monitoring and controlling cell cultivations for biopharmaceutical drug manufacturing. However, its implementation for culture monitoring in the cell line development stage has received little attention. Therefore, the impact of clonal differences, such as productivity and growth, on the prediction accuracy and transferability of Raman calibration models is not yet well described. Raman OPLS models were developed for predicting titer, glucose and lactate using eleven CHO clones from a single cell line. These clones exhibited diverse productivity and growth rates. The calibration models were evaluated for clone-related biases using clone-wise linear regression analysis on cross-validated predictions. The results revealed that clonal differences did not affect the prediction of glucose and lactate, but titer models showed a significant clone-related bias, which remained even after applying variable selection methods. The bias was associated with clonal productivity and led to increased prediction errors when titer models were transferred to cultivations with productivity levels outside the range of their training data. The findings demonstrate the feasibility of Raman-based monitoring of glucose and lactate in cell line development with high accuracy. However, accurate titer prediction requires careful consideration of clonal characteristics during model development.
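
The clone-wise regression check mentioned in the abstract can be illustrated with a minimal sketch. It assumes the cross-validated predictions are available as (clone, measured, predicted) tuples; the function names and the numbers are hypothetical and are not taken from the paper. A clone whose fitted slope or intercept departs clearly from 1 and 0 would hint at a clone-related bias.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def clone_wise_fits(records):
    """Fit predicted-vs-measured lines separately for each clone.

    records: iterable of (clone_id, measured, predicted) tuples.
    Returns {clone_id: (slope, intercept)}.
    """
    by_clone = {}
    for clone, measured, predicted in records:
        xs, ys = by_clone.setdefault(clone, ([], []))
        xs.append(measured)
        ys.append(predicted)
    return {clone: fit_line(xs, ys) for clone, (xs, ys) in by_clone.items()}

# Hypothetical cross-validated titer predictions (made-up numbers).
records = [
    ("A", 1.0, 1.0), ("A", 2.0, 2.0), ("A", 3.0, 3.0),  # unbiased clone
    ("B", 1.0, 1.5), ("B", 2.0, 3.0), ("B", 3.0, 4.5),  # overpredicted clone
]
fits = clone_wise_fits(records)
```

In this toy data, clone A lies on the identity line while clone B has a slope of 1.5, which is the kind of pattern a clone-wise analysis is meant to expose.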


Subjects
Lactic Acid , Raman Spectrum Analysis , Cricetinae , Animals , CHO Cells , Cricetulus , Calibration , Feasibility Studies , Lactic Acid/metabolism , Raman Spectrum Analysis/methods , Glucose/metabolism , Clone Cells/metabolism
2.
J Dairy Sci ; 106(11): 7407-7418, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37641350

ABSTRACT

Ripening is the most crucial process step in cheese manufacturing and involves multiple biochemical alterations that determine the final cheese quality and its perceived sensory attributes. The assessment of the cheese-ripening process is challenging and requires the effective analysis of a multitude of biochemical changes occurring during the process. This study monitored the biochemical and sensory attribute changes of paraffin wax-covered long-ripening hard cheeses (n = 79) during ripening by collecting samples at different stages of ripening. Near-infrared hyperspectral (NIR-HS) imaging, together with free amino acid, chemical composition, and sensory attributes, was studied to monitor the biochemical changes during the ripening process. Orthogonal projection-based multivariate calibration methods were used to characterize ripening-related and orthogonal components as well as the distribution map of chemical components. The results support NIR-HS imaging as a rapid tool for monitoring cheese maturity during ripening. Moreover, the pixelwise evaluation of images shows the homogeneity of cheese maturation at different stages of ripening. Among the chemical composition variables, fat content and moisture correlate most strongly with the NIR-HS images during the ripening process.

3.
Biotechnol J ; 17(12): e2200237, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36266999

ABSTRACT

BACKGROUND: Monoclonal antibodies (mAbs) are leading types of 'blockbuster' biotherapeutics worldwide; they have been successfully used to treat various cancers and chronic inflammatory and autoimmune diseases. Biotherapeutics process development and manufacturing are complicated by a lack of understanding of the factors that impact cell productivity and product quality attributes. Understanding complex interactions between cells, media, and process parameters on the molecular level is essential to bring biomanufacturing to the next level. This can be achieved by analyzing cell culture metabolic levels connected to vital process parameters like viable cell density (VCD). However, VCD and metabolic profiles are dynamic parameters and inherently correlated with time, leading to a significant correlation without actual causality. Many time-series methods deal with such issues. However, with metabolic profiling, the number of measured variables vastly exceeds the number of experiments, making most existing methods ill-suited and hard to interpret. METHODS AND MAJOR RESULTS: Here we propose an alternative workflow using hierarchical dimension reduction to visualize and interpret the relation between the evolution of metabolic profiles and dynamic process parameters. The first step of the proposed method focuses on finding a predictive relation between metabolic profiles and a process parameter at all time points using OPLS regression. For each time point, the p(corr) obtained from the OPLS model is considered a differential metabogram and is further assessed using principal component analysis (PCA). CONCLUSIONS: Compared to traditional batch modeling, applying the proposed methodology to metabolic data from Chinese Hamster Ovary (CHO) antibody production characterized the dynamic relation between metabolic profiles and critical process parameters.


Subjects
Metabolome , Metabolomics , Cricetinae , Animals , Cricetulus , CHO Cells , Cell Culture Techniques/methods
4.
Front Bioeng Biotechnol ; 10: 948905, 2022.
Article in English | MEDLINE | ID: mdl-36072286

ABSTRACT

There is a growing interest in continuous processing in the biopharmaceutical industry. However, the technology transfer from traditional batch-based processes is considered a challenge, as protocols and tools for use at the manufacturing scale remain to be established. Here, we present a model-based approach to design optimized perfusion cultures of Chinese Hamster Ovary cells using only the knowledge captured during small-scale fed-batch experiments. The novelty of the proposed model lies in the simplicity of its structure. Thanks to the introduction of a new catch-all variable representing a bulk of by-products secreted by the cells during their cultivation, the model was able to successfully predict cellular behavior under different operating modes without changes in its formalism. To our knowledge, this is the first experimentally validated model capable of capturing, with a single set of parameters, culture dynamics under different operating modes and at different scales.

5.
Comput Struct Biotechnol J ; 20: 3986-4002, 2022.
Article in English | MEDLINE | ID: mdl-35983235

ABSTRACT

Subcellular localization of ribonucleic acid (RNA) molecules provides significant insights into the functionality of RNAs and helps to explore their association with various diseases. The predominantly developed single-compartment localization predictors (SCLPs) fail to demystify the association of RNAs with diverse biochemical and pathological processes, which mainly arises through RNA co-localization in multiple compartments. The few available multi-compartment localization predictors (MCLPs) achieve decent performance only for a target RNA class of a particular sub-type. Further, existing computational approaches have limited practical significance and limited potential to optimize therapeutics because of their poor model explainability. This paper presents an explainable Long Short-Term Memory (LSTM) network, "EL-RMLocNet", whose predictive performance and interpretability are optimized using a novel GeneticSeq2Vec statistical representation learning scheme and an attention mechanism, for accurate multi-compartment localization prediction of different RNAs using only raw RNA sequences. GeneticSeq2Vec generates optimized statistical vectors of raw RNA sequences by capturing short- and long-range relations of nucleotide k-mers. From the sequence vectors generated by GeneticSeq2Vec, LSTM layers extract the most informative features, which an attention layer then weights according to their discriminative potential for accurate multi-compartment localization prediction. Through reverse engineering, the weights of the statistical feature space are mapped to nucleotide k-mer patterns to make multi-compartment localization predictions transparent and explainable for different RNA classes and species. Empirical evaluation indicates that EL-RMLocNet outperforms the state-of-the-art predictor for subcellular localization prediction of 4 different RNA classes by an average accuracy of 8% for Homo sapiens and 6% for Mus musculus.
EL-RMLocNet is freely available as a web server at (https://sds_genetic_analysis.opendfki.de/subcellular_loc/).
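
For readers unfamiliar with the term, the nucleotide k-mers referred to in the abstract are the overlapping substrings of length k in a sequence. The following minimal sketch only shows k-mer extraction and counting; it is not the GeneticSeq2Vec scheme, and the function name and example sequence are hypothetical.

```python
from collections import Counter

def kmers(seq, k):
    """Return all overlapping substrings of length k, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

# Hypothetical toy RNA sequence, used only for illustration.
counts = Counter(kmers("AUGGCAUG", 3))
```

Here the 3-mer "AUG" occurs twice (at the start and the end), so any representation built on such counts treats it as the most frequent pattern in this toy sequence.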

6.
Interdiscip Sci ; 14(4): 841-862, 2022 Dec.
Article in English | MEDLINE | ID: mdl-35947255

ABSTRACT

BACKGROUND AND OBJECTIVE: Interactions of long non-coding ribonucleic acids (lncRNAs) with micro-ribonucleic acids (miRNAs) play an essential role in gene regulation and in cellular metabolic and pathological processes. Existing purely sequence-based computational approaches lack robustness and efficiency, mainly because of the high length variability of lncRNA sequences. Hence, the prime focus of the current study is to find optimal length trade-offs for lncRNA sequences of highly variable length. METHOD: The paper performs an in-depth exploration of diverse copy padding and sequence truncation approaches, and presents a novel idea of utilizing only subregions of lncRNA sequences to generate fixed-length lncRNA sequences. Furthermore, it presents a novel bag-of-tricks-based deep learning approach, "BoT-Net", which leverages a single-layer long short-term memory network regularized through DropConnect to capture higher-order residue dependencies, pooling to retain the most salient features, normalization to prevent exploding and vanishing gradient issues, learning rate decay, and dropout to regularize a precise neural network for lncRNA-miRNA interaction prediction. RESULTS: BoT-Net outperforms the state-of-the-art lncRNA-miRNA interaction prediction approach by 2%, 8%, and 4% in terms of accuracy, specificity, and Matthews correlation coefficient, respectively. Furthermore, a case study analysis indicates that BoT-Net also outperforms the state-of-the-art lncRNA-protein interaction predictor on a benchmark dataset by 10% in accuracy, 19% in sensitivity, 6% in specificity, 14% in precision, and 26% in Matthews correlation coefficient. CONCLUSION: In the benchmark lncRNA-miRNA interaction prediction dataset, the length of the lncRNA sequence varies from 213 residues to 22,743 residues, and in the benchmark lncRNA-protein interaction prediction dataset, lncRNA sequences vary from 15 residues to 1504 residues.
For sequences of such highly variable length, fixed-length generation using copy padding introduces a significant level of bias, which makes a large number of lncRNA sequences nearly identical to each other and eventually derails classifier generalizability. Empirical evaluation reveals that the first 50 residues of the starting region of long lncRNA sequences contain a highly informative distribution for lncRNA-miRNA interaction prediction, a crucial finding exploited by the proposed BoT-Net approach to optimize the lncRNA fixed-length generation process. AVAILABILITY: The BoT-Net web server can be accessed at https://sds_genetic_analysis.opendfki.de/lncmiRNA/.
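
The two fixed-length strategies contrasted in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not define copy padding precisely, so it is taken here to mean repeating the sequence until the target length is reached, and the function names and the "N" pad character are hypothetical.

```python
def copy_pad(seq, length):
    """Repeat the sequence until it reaches `length`, then cut to size.

    Assumed interpretation of "copy padding"; not confirmed by the abstract.
    """
    if not seq:
        raise ValueError("empty sequence")
    repeats = -(-length // len(seq))  # ceiling division
    return (seq * repeats)[:length]

def head_region(seq, length, pad_char="N"):
    """Keep only the first `length` residues; pad short sequences on the right."""
    return seq[:length].ljust(length, pad_char)

padded = copy_pad("ACGU", 10)          # short sequence stretched by repetition
head = head_region("ACGUACGU", 5)      # long sequence reduced to its start
short_head = head_region("AC", 5)      # short sequence padded with "N"
```

With copy padding, a 15-residue sequence stretched to thousands of residues becomes mostly repeated content, which is one way to picture the bias the authors describe; taking only the starting region avoids that repetition.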


Subjects
MicroRNAs , Long Noncoding RNA , Long Noncoding RNA/genetics , Long Noncoding RNA/metabolism , MicroRNAs/genetics , MicroRNAs/metabolism , Computational Biology , Neural Networks (Computer) , Gene Expression Regulation
7.
mBio ; 13(3): e0089222, 2022 06 28.
Article in English | MEDLINE | ID: mdl-35532162

ABSTRACT

The coronavirus disease 2019, COVID-19, is a complex disease with a wide range of symptoms from asymptomatic infections to severe acute respiratory syndrome with lethal outcome. Individual factors such as age, sex, and comorbidities increase the risk for severe infections, but other aspects, such as genetic variations, are also likely to affect the susceptibility to SARS-CoV-2 infection and disease severity. Here, we used a human 3D lung cell model based on primary cells derived from multiple donors to identify host factors that regulate SARS-CoV-2 infection. With a transcriptomics-based approach, we found that less susceptible donors show a higher expression level of serine protease inhibitors SERPINA1, SERPINE1, and SERPINE2, identifying variation in cellular serpin levels as restricting host factors for SARS-CoV-2 infection. We pinpoint their antiviral mechanism of action to inhibition of the cellular serine protease, TMPRSS2, thereby preventing cleavage of the viral spike protein and TMPRSS2-mediated entry into the target cells. By means of single-cell RNA sequencing, we further locate the expression of the individual serpins to basal, ciliated, club, and goblet cells. Our results add to the importance of genetic variations as determinants for SARS-CoV-2 susceptibility and suggest that genetic deficiencies of cellular serpins might represent risk factors for severe COVID-19. Our study further highlights TMPRSS2 as a promising target for antiviral intervention and opens the door for the usage of locally administered serpins as a treatment against COVID-19. IMPORTANCE Identification of host factors affecting individual SARS-CoV-2 susceptibility will provide a better understanding of the large variations in disease severity and will identify potential factors that can be used, or targeted, in antiviral drug development.
With the use of an advanced lung cell model established from several human donors, we identified cellular protease inhibitors, serpins, as host factors that restrict SARS-CoV-2 infection. The antiviral mechanism was found to be mediated by the inhibition of a serine protease, TMPRSS2, which results in a blockage of viral entry into target cells. Potential treatments with these serpins would not only reduce the overall viral burden in the patients, but also block the infection at an early time point, reducing the risk for the hyperactive immune response common in patients with severe COVID-19.


Subjects
Antiviral Agents , COVID-19 Drug Treatment , Serine Proteinase Inhibitors , Serpins , Antiviral Agents/pharmacology , Humans , Plasminogen Activator Inhibitor 1 , SARS-CoV-2 , Serine Endopeptidases , Serine Proteinase Inhibitors/pharmacology , Serpin E2 , Serpins/genetics , Virus Internalization , alpha 1-Antitrypsin
8.
Acta Paediatr ; 111(8): 1526-1535, 2022 08.
Article in English | MEDLINE | ID: mdl-35397189

ABSTRACT

AIM: To assess the strength of associations between interrelated perinatal risk factors and mortality in very preterm infants. METHODS: Information on all live-born infants delivered in Sweden at 22-31 weeks of gestational age (GA) from 2011 to 2019 was gathered from the Swedish Neonatal Quality Register, excluding infants with major malformations or not resuscitated because of anticipated poor prognosis. Twenty-seven perinatal risk factors available at birth were the exposures, and in-hospital mortality was the outcome. Orthogonal partial least squares discriminant analysis was applied to assess proximity between individual risk factors and mortality, and receiver operating characteristic (ROC) curves were used to estimate discriminant ability. RESULTS: In total, 638 of 8,396 (7.6%) infants died. Thirteen risk factors discriminated reduced mortality; the most important were higher Apgar scores at 5 and 10 min, GA and birthweight. Restricting the analysis to preterm infants <28 weeks' GA (n = 2939, 16.9% mortality) added antenatal corticosteroid therapy as significantly associated with lower mortality. The area under the ROC curve (the C-statistic) using all risk factors was 0.86, as determined after both internal and external validation. CONCLUSION: Apgar scores, gestational age and birthweight show stronger associations with mortality in very preterm infants than several other perinatal risk factors available at birth.
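
The C-statistic reported in the abstract equals the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name and the toy labels and scores are hypothetical and unrelated to the register data.

```python
def c_statistic(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and non-cases.

    labels: 1 for the event (e.g. death), 0 otherwise.
    scores: predicted risk, higher meaning more likely to be a case.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up example: two cases, three non-cases.
auc = c_statistic([1, 1, 0, 0, 0], [0.9, 0.4, 0.5, 0.2, 0.1])
```

In the toy example, five of the six case/non-case pairs are ordered correctly, giving a C-statistic of 5/6. The pairwise loop is quadratic in sample size, so a rank-based formula would be preferable for a cohort of thousands of infants.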


Subjects
Premature Infant Diseases , Premature Infant , Birth Weight , Discriminant Analysis , Female , Fetal Growth Retardation , Gestational Age , Humans , Infant , Infant Mortality , Newborn Infant , Perinatal Mortality , Pregnancy , Risk Factors
9.
J Environ Manage ; 301: 113941, 2022 Jan 01.
Article in English | MEDLINE | ID: mdl-34731954

ABSTRACT

Understanding the mechanisms of pollutant removal in Wastewater Treatment Plants (WWTPs) is crucial for controlling effluent quality efficiently. However, the numerous treatment units, operational factors, and the underlying interactions between these units and factors usually obfuscate the comprehensive and precise understanding of the processes. We have previously proposed a machine learning (ML) framework to uncover complex cause-and-effect relationships in WWTPs. However, only one interpretable ML model, Random Forest (RF), was studied and the interpretation method was not granular enough to reveal very detailed relationships between operational factors and effluent parameters. Thus, in this paper, we present an upgraded framework involving three interpretable tree-based models (RF, XGBoost and LightGBM), three metrics (R², root mean squared error (RMSE), and mean absolute error (MAE)) and a more advanced interpretation system, SHapley Additive exPlanations (SHAP). Details of the framework are provided along with a demonstration of its practical applicability based on a case study of the Umeå WWTP in Sweden. Results show that, for both labels TSSe (total suspended solids in effluent) and PO4e (phosphate in effluent), the XGBoost models perform best whereas the RF models perform worst, due to overfitting and polarized fitting. This study has yielded multiple new and significant findings with respect to the control of TSSe and PO4e in the Umeå WWTP and other similarly configured WWTPs. Additionally, this study has produced two important generic findings relating to ML applications for WWTPs (or even other process industries) in terms of cause-and-effect investigations. First, the model comparison should be carried out from multiple perspectives to ensure that underlying details are fully revealed and examined.
Second, using a precise, robust, and granular (feature attribution available for individual instances) explanation method can bring extra insight into both model comparison and model interpretation. SHAP is recommended as we found it to be of great value in this study.
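
The three evaluation metrics named in the abstract have standard definitions, sketched below in plain Python. The function name and the toy values are hypothetical; the paper's own evaluation code is not shown in the abstract.

```python
from math import sqrt

def regression_metrics(y_true, y_pred):
    """Return R2, RMSE and MAE for paired observed and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }

# Made-up effluent values: three perfect predictions and one large miss.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
```

The single large miss gives an RMSE of 1.0 but an MAE of only 0.5, which illustrates why comparing models on several metrics, as the authors recommend, can reveal different aspects of fit.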


Subjects
Machine Learning , Water Purification , Sweden
10.
Nat Methods ; 18(9): 1038-1045, 2021 09.
Article in English | MEDLINE | ID: mdl-34462594

ABSTRACT

Light microscopy combined with well-established protocols of two-dimensional cell culture facilitates high-throughput quantitative imaging to study biological phenomena. Accurate segmentation of individual cells in images enables exploration of complex biological questions, but can require sophisticated image processing pipelines in cases of low contrast and high object density. Deep learning-based methods are considered state-of-the-art for image segmentation but typically require vast amounts of annotated data, for which there is no suitable resource available in the field of label-free cellular imaging. Here, we present LIVECell, a large, high-quality, manually annotated and expert-validated dataset of phase-contrast images, consisting of over 1.6 million cells from a diverse set of cell morphologies and culture densities. To further demonstrate its use, we train convolutional neural network-based models using LIVECell and evaluate model segmentation accuracy with a proposed suite of benchmarks.


Subjects
Factual Databases , Computer-Assisted Image Processing/methods , Microscopy/methods , Biological Models , Cell Culture Techniques , Humans , Neural Networks (Computer)
11.
Nat Protoc ; 16(9): 4299-4326, 2021 09.
Article in English | MEDLINE | ID: mdl-34321638

ABSTRACT

Metabolic phenotyping is an important tool in translational biomedical research. The advanced analytical technologies commonly used for phenotyping, including mass spectrometry (MS) and nuclear magnetic resonance (NMR) spectroscopy, generate complex data requiring tailored statistical analysis methods. Detailed protocols have been published for data acquisition by liquid NMR, solid-state NMR, ultra-performance liquid chromatography (LC-)MS and gas chromatography (GC-)MS on biofluids or tissues and their preprocessing. Here we propose an efficient protocol (guidelines and software) for statistical analysis of metabolic data generated by these methods. Code for all steps is provided, and no prior coding skill is necessary. We offer efficient solutions for the different steps required within the complete phenotyping data analytics workflow: scaling, normalization, outlier detection, multivariate analysis to explore and model study-related effects, selection of candidate biomarkers, validation, multiple testing correction and performance evaluation of statistical models. We also provide a statistical power calculation algorithm and safeguards to ensure robust and meaningful experimental designs that deliver reliable results. We exemplify the protocol with a two-group classification study and data from an epidemiological cohort; however, the protocol can be easily modified to cover a wider range of experimental designs or incorporate different modeling approaches. This protocol describes a minimal set of analyses needed to rigorously investigate typical datasets encountered in metabolic phenotyping.


Subjects
Genetic Techniques , Metabolomics/methods , Phenotype , Software , Statistics as Topic , Humans , Metabolism
12.
Sci Total Environ ; 784: 147138, 2021 Aug 25.
Article in English | MEDLINE | ID: mdl-34088065

ABSTRACT

Due to the intrinsic complexity of wastewater treatment plant (WWTP) processes, it is always challenging to respond promptly and appropriately to the dynamic process conditions in order to ensure the quality of the effluent, especially when operational cost is a major concern. Machine Learning (ML) methods have therefore been used to model WWTP processes in order to avoid various shortcomings of conventional mechanistic models. However, to the best of the authors' knowledge, no ML applications have focused on investigating how operational factors can affect effluent quality. Additionally, the time lags between process steps have always been neglected, making it difficult to explain the relationships between operational factors and effluent quality. Therefore, this paper presents a novel ML-based framework designed to improve effluent quality control in WWTPs by clarifying the relationships between operational variables and effluent parameters. The framework consists of Random Forest (RF) models, Deep Neural Network (DNN) models, Variable Importance Measure (VIM) analyses, and Partial Dependence Plot (PDP) analyses, and uses a novel approach to account for the impact of time lags between processes. Details of the framework are provided along with a demonstration of its practical applicability based on a case study of the Umeå WWTP in Sweden involving a large number of samples (105763) representing the full scale of the plant's operations. Two effluent parameters, Total Suspended Solids in effluent (TSSe) and Phosphate in effluent (PO4e), and thirty-two operational variables are studied. RF models are developed, validated using DNN models as references, and shown to be suitable for VIM and PDP analyses. VIM identifies the variables that most strongly influence TSSe and PO4e, while PDP elucidates their specific effects on TSSe and PO4e. 
The major findings are: (1) Influent temperature is the most influential variable for both TSSe and PO4e, but it affects them in different ways; (2) PO4e depends strongly on the TSS in aeration basins - higher TSS concentrations in aeration basins generally promote PO4 removal, but excess TSS can have negative effects; (3) In general, the impact of TSS in aeration basins on TSSe and PO4e increases with the distance of the basin from the merging outlet, so more attention should be paid to the TSS concentration in the third and fourth aeration basins than in the first and second ones; (4) Returning excessive amounts of sludge through the second return sludge pipe should be avoided because of its adverse impact on TSSe removal. These results could support the development of more advanced control strategies to increase control precision and reduce running costs in the Umeå WWTP and other similarly configured WWTPs. The framework could also be applied to other parameters in WWTPs and industrial processes in general if sufficient high-resolution data are available.
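
The abstract stresses that time lags between process steps must be accounted for but does not describe how. One generic way to do this, shown here purely as an assumption and not as the authors' method, is to pair each upstream measurement with the effluent value observed a fixed number of time steps later. The function name and the toy series are hypothetical.

```python
def align_with_lag(feature, target, lag):
    """Pair feature[t] with target[t + lag] for every t where both exist.

    `lag` is the assumed number of time steps it takes for a change in the
    upstream variable to show up in the effluent measurement.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    # zip stops at the shorter input, so trailing unmatched samples are dropped.
    return list(zip(feature, target[lag:]))

# Made-up series: upstream readings and effluent readings on the same clock.
upstream = [10, 11, 12, 13, 14]
effluent = [0, 1, 2, 3, 4]
pairs = align_with_lag(upstream, effluent, 2)
```

With a lag of 2, the first upstream reading is matched to the third effluent reading, and the last two upstream readings are dropped because their corresponding effluent values have not been observed yet.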

13.
BMC Bioinformatics ; 22(1): 176, 2021 Apr 03.
Article in English | MEDLINE | ID: mdl-33812384

ABSTRACT

BACKGROUND: For multivariate data analysis involving only two input matrices (e.g., X and Y), the previously published methods for variable influence on projection (e.g., VIP_OPLS or VIP_O2PLS) are widely used for variable selection purposes, including (i) variable importance assessment, (ii) dimensionality reduction of big data and (iii) interpretation enhancement of PLS, OPLS and O2PLS models. For multiblock analysis, the OnPLS models find relationships among multiple data matrices (more than two blocks) by calculating latent variables; however, a method for improving the interpretation of these latent variables (model components) by assessing the importance of the input variables has not been available until now. RESULTS: A method for variable selection in multiblock analysis, called multiblock variable influence on orthogonal projections (MB-VIOP), is explained in this paper. MB-VIOP is a model-based variable selection method that uses the data matrices, the scores and the normalized loadings of an OnPLS model in order to sort the input variables of more than two data matrices according to their importance for both simplification and interpretation of the total multiblock model, and also of the unique, local and global model components separately. MB-VIOP has been tested using three datasets: a synthetic four-block dataset, a real three-block omics dataset related to plant sciences, and a real six-block dataset related to the food industry. CONCLUSIONS: We provide evidence for the usefulness and reliability of MB-VIOP by means of three examples (one synthetic and two real-world cases). MB-VIOP reliably and efficiently assesses the importance of both isolated variables and ranges of variables in any type of data. MB-VIOP connects the input variables of different data matrices according to their relevance for the interpretation of each latent variable, yielding enhanced interpretability for each OnPLS model component.
Besides, MB-VIOP can deal with strong overlapping of types of variation, as well as with many data blocks with very different dimensionality. The ability of MB-VIOP to generate dimensionality-reduced models with high interpretability makes this method ideal for big data mining, multi-omics data integration and any study that requires exploration and interpretation of large streams of data.


Subjects
Data Analysis , Data Mining , Multivariate Analysis , Reproducibility of Results
14.
SLAS Technol ; 26(4): 408-414, 2021 08.
Article in English | MEDLINE | ID: mdl-33874798

ABSTRACT

Machine vision is a powerful technology that has become increasingly popular and accurate during the last decade due to rapid advances in the field of machine learning. The majority of machine vision applications are currently found in consumer electronics, automotive applications, and quality control, yet the potential for bioprocessing applications is tremendous. For instance, detecting and controlling foam emergence is important for all upstream bioprocesses, but the lack of robust foam sensing often leads to batch failures from foam-outs or overaddition of antifoam agents. Here, we report a new low-cost, flexible, and reliable foam sensor concept for bioreactor applications. The concept applies convolutional neural networks (CNNs), a state-of-the-art machine learning system for image processing. The implemented method shows high accuracy for both binary foam detection (foam/no foam) and fine-grained classification of foam levels.


Subjects
Machine Learning , Neural Networks (Computer) , Algorithms , Bioreactors , Computer-Assisted Image Processing
15.
Biotechnol Biofuels ; 14(1): 43, 2021 Feb 16.
Article in English | MEDLINE | ID: mdl-33593413

ABSTRACT

BACKGROUND: Bioconversion of wood into bioproducts and biofuels is hindered by the recalcitrance of woody raw material to bioprocesses such as enzymatic saccharification. Targeted modification of the chemical composition of the feedstock can improve saccharification but this gain is often abrogated by concomitant reduction in tree growth. RESULTS: In this study, we report on transgenic hybrid aspen (Populus tremula × tremuloides) lines that showed potential to increase biomass production both in the greenhouse and after 5 years of growth in the field. The transgenic lines carried an overexpression construct for Populus tremula × tremuloides vesicle-associated membrane protein (VAMP)-associated protein PttVAP27-17 that was selected from a gene-mining program for novel regulators of wood formation. Analytical-scale enzymatic saccharification without any pretreatment revealed for all greenhouse-grown transgenic lines, compared to the wild type, a 20-44% increase in the glucose yield per dry weight after enzymatic saccharification, even though it was statistically significant only for one line. The glucose yield after enzymatic saccharification with a prior hydrothermal pretreatment step with sulfuric acid was not increased in the greenhouse-grown transgenic trees on a dry-weight basis, but increased by 26-50% when calculated on a whole biomass basis in comparison to the wild-type control. Tendencies toward increased glucose yields, by up to 24%, were also present on a whole tree biomass basis after acidic pretreatment and enzymatic saccharification in the transgenic trees grown for 5 years in the field when compared to the wild-type control. CONCLUSIONS: The results demonstrate the usefulness of gene-mining programs to identify novel genes with the potential to improve biofuel production in tree biotechnology programs.
Furthermore, multi-omic analyses, including transcriptomic, proteomic and metabolomic analyses, performed here provide a toolbox for future studies on the function of VAP27 proteins in plants.

16.
Genes (Basel) ; 11(12), 2020 12 09.
Article in English | MEDLINE | ID: mdl-33316943

ABSTRACT

MicroRNAs (miRNAs) are small noncoding RNA sequences consisting of about 22 nucleotides that are involved in the regulation of almost 60% of mammalian genes. Presently, there are very limited approaches for the visualization of miRNA locations inside cells to support the elucidation of pathways and mechanisms behind miRNA function, transport, and biogenesis. MIRLocator, a state-of-the-art tool for the prediction of subcellular localization of miRNAs, makes use of a sequence-to-sequence model along with pretrained k-mer embeddings. Existing pretrained k-mer embedding generation methodologies focus on the extraction of semantics of k-mers. However, in RNA sequences, positional information of nucleotides is more important because distinct positions of the four nucleotides define the function of an RNA molecule. Considering the importance of the nucleotide position, we propose a novel approach (kmerPR2vec) which is a fusion of positional information of k-mers with randomly initialized neural k-mer embeddings. In contrast to existing k-mer-based representations, the proposed kmerPR2vec representation is much richer in terms of semantic information and has more discriminative power. Using the novel kmerPR2vec representation, we further present an end-to-end system (MirLocPredictor) which couples the discriminative power of kmerPR2vec with Convolutional Neural Networks (CNNs) for miRNA subcellular location prediction. The effectiveness of the proposed kmerPR2vec approach is evaluated with deep learning-based topologies (i.e., Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs)) and by using 9 different evaluation measures. Analysis of the results reveals that MirLocPredictor outperforms state-of-the-art methods by a significant margin of 18% and 19% in terms of precision and recall, respectively.


Subjects
MicroRNAs/analysis , MicroRNAs/genetics , Nucleotide Mapping/methods , Algorithms , Animals , Computational Biology/methods , Deep Learning , Forecasting/methods , Humans , Intracellular Space/genetics , Computer Neural Networks , RNA Sequence Analysis/methods
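
The abstract in entry 16 stresses that the position of each k-mer matters, not only its identity. As a minimal sketch of that input step only, the snippet below splits an RNA sequence into overlapping k-mers and pairs each with its position. The function name `kmers_with_positions` and the example sequence are hypothetical; this is not the kmerPR2vec implementation, which additionally learns neural embeddings.

```python
def kmers_with_positions(seq: str, k: int = 3) -> list[tuple[int, str]]:
    """Return (position, k-mer) pairs for every overlapping k-mer in seq.

    Hypothetical helper for illustration; positions are 0-based.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = seq.upper()
    # A sequence of length n yields n - k + 1 overlapping k-mers.
    return [(i, seq[i:i + k]) for i in range(len(seq) - k + 1)]


# Example with an arbitrary 8-nucleotide RNA fragment.
pairs = kmers_with_positions("UGAGGUAG", k=3)
```

A position-aware representation could then combine each k-mer's embedding with an encoding of its position; how the two are fused in the published method is not specified in the abstract.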
17.
PLoS One ; 15(9): e0237721, 2020.
Article in English | MEDLINE | ID: mdl-32915809

ABSTRACT

The number of national reference populations that are whole-genome sequenced is rapidly increasing. Partly driving this development is the fact that genetic disease studies benefit from knowing the genetic variation typical for the geographical area of interest. A whole-genome sequenced Swedish national reference population (n = 1000) has recently been published, but with few samples from northern Sweden. In the present study we have whole-genome sequenced a control population (n = 300) (ACpop) from Västerbotten County, a sparsely populated region in northern Sweden previously shown to be genetically different from southern Sweden. The aggregated variant frequencies within ACpop are publicly available (DOI 10.17044/NBIS/G000005) to function as a basic resource in clinical genetics and for genetic studies. Our analysis of ACpop, representing approximately 0.11% of the population in Västerbotten, indicates the presence of a genetic substructure within the county. Furthermore, a demographic analysis showed that the population from which samples were drawn was to a large extent geographically stationary, a finding that was corroborated in the genetic analysis down to the level of municipalities. Including ACpop in the reference population when imputing unknown variants in a Västerbotten cohort resulted in a strong increase in the number of high-confidence imputed variants (up to 81% for variants with minor allele frequency < 5%). ACpop was initially designed for cancer disease studies, but the genetic structure within the cohort will be of general interest for all genetic disease studies in northern Sweden.


Subjects
Human Genome , Genetic Polymorphism , Population/genetics , Aged , Aged 80 and Over , Genetic Databases , Female , Humans , Male , Middle Aged , Sweden , Whole Genome Sequencing
18.
Vaccines (Basel) ; 8(3)2020 Jul 15.
Article in English | MEDLINE | ID: mdl-32679889

ABSTRACT

The expression of Vitis vinifera polygalacturonase inhibiting protein 1 (VviPGIP1) in Nicotiana tabacum has been linked to modifications at the cell wall level. Previous investigations have shown an upregulation of the lignin biosynthesis pathway and reorganisation of arabinoxyloglucan composition. This suggests that cell wall tightening occurs, which may be linked to defence priming responses. The present study used a screening approach to test four VviPGIP1 and four NtCAD14 overexpressing transgenic lines for cell wall alterations. Overexpressing the tobacco-derived cinnamyl alcohol dehydrogenase (NtCAD14) gene is known to increase lignin biosynthesis and deposition. These lines, particularly the PGIP1 expressing plants, have been shown to be less susceptible to the grey rot fungus Botrytis cinerea. In this study the aim was to investigate the cell wall modulations that occurred prior to infection, which should highlight potential priming phenomena and phenotypes. Leaf lignin composition and the relative concentration of constituent monolignols were evaluated using pyrolysis gas chromatography. Significant concentrations of lignin were deposited in the stems but not the leaves of NtCAD14 overexpressing plants. Furthermore, no significant changes in monolignol composition were found between transgenic and wild type plants. The polysaccharide modifications were quantified using gas chromatography-mass spectrometry (GC-MS) of constituent monosaccharides. The major leaf polysaccharide and cell wall protein components were evaluated using comprehensive microarray polymer profiling (CoMPP). The most significant changes appeared at the polysaccharide and protein level. The pectin fraction of both the VviPGIP1 and NtCAD14 transgenic lines showed subtle variations in the patterning of methylesterification epitopes versus wild type. Pectin esterification levels have been linked to pathogen defence in the past. The most marked changes occurred in glycoprotein abundance for both the VviPGIP1 and NtCAD14 lines. Epitopes for arabinogalactan proteins (AGPs) and extensins were notably altered in transgenic NtCAD14 tobacco.

19.
Metabolites ; 10(7)2020 Jul 17.
Article in English | MEDLINE | ID: mdl-32709053

ABSTRACT

Data integration has been proven to provide valuable information. The information extracted using data integration in the form of multiblock analysis can pinpoint both common and unique trends in the different blocks. When working with small multiblock datasets, the number of possible integration methods is drastically reduced. To investigate the application of multiblock analysis in cases where one has a small number of samples and a lack of statistical power, we studied a small metabolomic multiblock dataset containing six blocks (i.e., tissue types), only including common metabolites. We used a single-model multiblock analysis method called joint and unique multiblock analysis (JUMBA) and compared it to a commonly used method, concatenated principal component analysis (PCA). These methods were used to detect trends in the dataset and identify underlying factors responsible for metabolic variations. Using JUMBA, we were able to interpret the extracted components and link them to relevant biological properties. JUMBA shows how the observations are related to one another, the stability of these relationships, and to what extent each of the blocks contributes to the components. These results indicate that multiblock methods can be useful even with a small number of samples.
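
For orientation, the comparison method named in entry 19, concatenated PCA, can be sketched as follows: each block is autoscaled, weighted so that no block dominates through its number of variables, joined column-wise, and decomposed with a singular value decomposition. This is a minimal sketch under stated assumptions: the block weighting by 1/sqrt(number of variables), the function name `concatenated_pca`, and the random example data are illustrative choices, not details taken from the study, and JUMBA itself is not implemented here.

```python
import numpy as np


def concatenated_pca(blocks, n_components=2):
    """PCA on column-wise concatenated, block-scaled data (illustrative sketch).

    blocks: list of arrays, each (n_samples, n_variables_in_block),
            with the same samples in the same row order.
    Returns scores, loadings, and the explained-variance ratio per component.
    """
    scaled = []
    for X in blocks:
        X = np.asarray(X, dtype=float)
        Xc = X - X.mean(axis=0)            # mean-centre each variable
        sd = Xc.std(axis=0, ddof=1)
        sd[sd == 0] = 1.0                  # leave constant columns at zero
        Xs = Xc / sd                       # unit variance per variable
        # Assumed block weighting: equal total variance for every block.
        scaled.append(Xs / np.sqrt(X.shape[1]))
    Z = np.hstack(scaled)                  # samples x all variables
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


# Example: six blocks (e.g., tissue types) measured on the same 8 samples.
rng = np.random.default_rng(0)
blocks = [rng.normal(size=(8, 5)) for _ in range(6)]
scores, loadings, explained = concatenated_pca(blocks)
```

Because all blocks share one set of loadings, concatenated PCA cannot by itself separate variation that is joint across blocks from variation unique to a single block, which is the distinction the abstract attributes to JUMBA.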

20.
Methods Mol Biol ; 2149: 327-337, 2020.
Article in English | MEDLINE | ID: mdl-32617943

ABSTRACT

Plant cell walls are composed of a number of coextensive polysaccharide-rich networks (i.e., pectin, hemicellulose, protein). Polysaccharide-rich cell walls are important in a number of biological processes including fruit ripening, plant-pathogen interactions (e.g., pathogenic fungi), fermentations (e.g., winemaking), and tissue differentiation (e.g., secondary cell walls). Applying appropriate methods is necessary to assess biological roles, for example in the functional characterization of putative plant genes (e.g., experimental evaluation of transgenic plants). Obtaining datasets is relatively easy, using for example gas chromatography-mass spectrometry (GC-MS) methods for monosaccharide composition, Fourier transform infrared spectroscopy (FT-IR), and comprehensive microarray polymer profiling (CoMPP); however, analyzing the data requires implementing statistical tools for large-scale datasets. We have validated and implemented a range of multivariate data analysis methods on datasets from tobacco, grapevine, and wine polysaccharide studies. Here we present the workflow from processing samples to acquiring data to performing data analysis (particularly principal component analysis (PCA) and orthogonal projection to latent structure (OPLS) methods).


Subjects
Cell Wall/chemistry , Plant Cells/chemistry , Biopolymers/analysis , Least-Squares Analysis , Multivariate Analysis , Principal Component Analysis