Results 1 - 20 of 3,094
1.
Immunity ; 51(4): 696-708.e9, 2019 10 15.
Article in English | MEDLINE | ID: mdl-31618654

ABSTRACT

Signaling abnormalities in immune responses in the small intestine can trigger chronic type 2 inflammation involving interaction of multiple immune cell types. To systematically characterize this response, we analyzed 58,067 immune cells from the mouse small intestine by single-cell RNA sequencing (scRNA-seq) at steady state and after induction of a type 2 inflammatory reaction to ovalbumin (OVA). Computational analysis revealed broad shifts in both cell-type composition and cell programs in response to the inflammation, especially in group 2 innate lymphoid cells (ILC2s). Inflammation induced the expression of exon 5 of Calca, which encodes the alpha-calcitonin gene-related peptide (α-CGRP), in intestinal KLRG1+ ILC2s. α-CGRP antagonized KLRG1+ ILC2 proliferation but promoted IL-5 expression. Genetic perturbation of α-CGRP increased the proportion of intestinal KLRG1+ ILC2s. Our work highlights a model where α-CGRP-mediated neuronal signaling is critical for suppressing ILC2 expansion and maintaining homeostasis of the type 2 immune machinery.


Subjects
Calcitonin Gene-Related Peptide/metabolism , Inflammation/immunology , Intestines/immunology , Lymphocytes/immunology , Neuropeptides/metabolism , Animals , Calcitonin Gene-Related Peptide/genetics , Cultured Cells , Computational Biology , Innate Immunity , Interleukin-5/genetics , Interleukin-5/metabolism , C-Type Lectins/metabolism , Mice , Inbred BALB C Mice , Transgenic Mice , Neuropeptides/genetics , Immunologic Receptors/metabolism , RNA Sequence Analysis , Signal Transduction , Single-Cell Analysis , Th2 Cells/immunology , Transcriptome , Up-Regulation
2.
Proc Natl Acad Sci U S A ; 121(9): e2316301121, 2024 Feb 27.
Article in English | MEDLINE | ID: mdl-38377198

ABSTRACT

Modern deep networks are trained with stochastic gradient descent (SGD), whose key hyperparameters are the number of data points considered at each step, or batch size [Formula: see text], and the step size, or learning rate [Formula: see text]. For small [Formula: see text] and large [Formula: see text], SGD corresponds to a stochastic evolution of the parameters, whose noise amplitude is governed by the "temperature" [Formula: see text]. Yet this description is observed to break down for sufficiently large batches [Formula: see text], or simplifies to gradient descent (GD) when the temperature is sufficiently small. Understanding where these cross-overs take place remains a central challenge. Here, we resolve these questions for a teacher-student perceptron classification model and show empirically that our key predictions still apply to deep networks. Specifically, we obtain a phase diagram in the [Formula: see text]-[Formula: see text] plane that separates three dynamical phases: i) a noise-dominated SGD governed by temperature, ii) a large-first-step-dominated SGD and iii) GD. These different phases also correspond to different regimes of generalization error. Remarkably, our analysis reveals that the batch size [Formula: see text] separating regimes (i) and (ii) scales with the size [Formula: see text] of the training set, with an exponent that characterizes the hardness of the classification problem.
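
Illustrative sketch (not part of the indexed record): the abstract above describes minibatch SGD on a teacher-student perceptron controlled by a batch size and a learning rate. The Python below is a minimal toy version of that setting. The hinge loss, the data sizes, and the convention that the "temperature" is the learning rate divided by the batch size are assumptions made here for illustration; the record elides the actual formulas, so they may differ from the authors' definitions.

```python
import numpy as np

def sgd_perceptron(n_train=512, dim=32, batch_size=16, lr=0.1, steps=300, seed=0):
    """Minibatch SGD on a hinge loss for a toy teacher-student perceptron."""
    rng = np.random.default_rng(seed)
    teacher = rng.standard_normal(dim)        # fixed teacher weights
    x = rng.standard_normal((n_train, dim))   # training inputs
    y = np.sign(x @ teacher)                  # labels assigned by the teacher
    w = np.zeros(dim)                         # student weights
    for _ in range(steps):
        idx = rng.choice(n_train, size=batch_size, replace=False)
        margins = y[idx] * (x[idx] @ w)
        active = margins < 1.0                # samples inside the hinge margin
        # Hinge-loss gradient averaged over the whole minibatch.
        grad = -(y[idx][active, None] * x[idx][active]).sum(axis=0) / batch_size
        w -= lr * grad
    train_error = float(np.mean(np.sign(x @ w) != y))
    return w, train_error

# Assumed convention (not taken from the record): temperature = lr / batch size.
lr, batch_size = 0.1, 16
temperature = lr / batch_size
w, err = sgd_perceptron(batch_size=batch_size, lr=lr)
```

Sweeping `batch_size` and `lr` over a grid and recording `err` on held-out data would be one way to explore the kind of phase diagram the abstract mentions, under the assumptions stated above.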

3.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38271483

ABSTRACT

The advent of single-cell sequencing technologies has revolutionized cell biology studies. However, integrative analyses of diverse single-cell data face serious challenges, including technological noise, sample heterogeneity, and different modalities and species. To address these problems, we propose scCorrector, a variational autoencoder-based model that can integrate single-cell data from different studies and map them into a common space. Specifically, we designed a Study Specific Adaptive Normalization for each study in the decoder to implement these features. scCorrector achieves competitive and robust performance compared with state-of-the-art methods and brings novel insights under various circumstances (e.g. various batches, multi-omics, cross-species, and development stages). In addition, the integration of single-cell data and spatial data makes it possible to transfer information between different studies, which greatly expands the narrow range of genes covered by MERFISH technology. In summary, scCorrector can efficiently integrate multi-study single-cell datasets, thereby providing broad opportunities to tackle challenges emerging from noisy resources.

4.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37497716

ABSTRACT

Cytometry enables precise single-cell phenotyping within heterogeneous populations. These cell types are traditionally annotated via manual gating, but this method lacks reproducibility and is sensitive to batch effects. Also, the most recent cytometers (spectral flow or mass cytometers) create rich, high-dimensional data whose analysis via manual gating becomes challenging and time-consuming. To tackle these limitations, we introduce Scyan (https://github.com/MICS-Lab/scyan), a Single-cell Cytometry Annotation Network that automatically annotates cell types using only prior expert knowledge about the cytometry panel. For this, it uses a normalizing flow, a type of deep generative model, that maps protein expressions into a biologically relevant latent space. We demonstrate that Scyan significantly outperforms the related state-of-the-art models on multiple public datasets while being faster and interpretable. In addition, Scyan handles several complementary tasks, such as batch-effect correction, debarcoding and population discovery. Overall, this model accelerates and eases cell population characterization, quantification and discovery in cytometry.


Subjects
Biology , Reproducibility of Results , Flow Cytometry/methods
5.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-37991248

ABSTRACT

The high dimensionality and sparsity of the gene expression matrix in single-cell RNA-sequencing (scRNA-seq) data, coupled with significant noise generated by shallow sequencing, pose a great challenge for cell clustering methods. While numerous computational methods have been proposed, the majority of existing approaches center on processing the target dataset itself. This approach disregards the wealth of knowledge present within other species and batches of scRNA-seq data. In light of this, our paper proposes a novel method named graph-based deep embedding clustering (GDEC) that leverages transfer learning across species and batches. GDEC integrates graph convolutional networks, effectively overcoming the challenges posed by sparse gene expression matrices. Additionally, the incorporation of DEC in GDEC enables the partitioning of cell clusters within a lower-dimensional space, thereby mitigating the adverse effects of noise on clustering outcomes. GDEC constructs a model based on existing scRNA-seq datasets and then applies transfer learning techniques to fine-tune the model using a limited amount of prior knowledge gleaned from the target dataset. This empowers GDEC to adeptly cluster scRNA-seq data across different species and batches. Through cross-species and cross-batch clustering experiments, we conducted a comparative analysis between GDEC and conventional packages. Furthermore, we implemented GDEC on the scRNA-seq data of uterine fibroids. Compared with results obtained from the Seurat package, GDEC unveiled a novel cell type (epithelial cells) and identified a notable number of new pathways among various cell types, thus underscoring the enhanced analytical capabilities of GDEC. Availability and implementation: https://github.com/YuzhiSun/GDEC/tree/main.


Subjects
Gene Expression Profiling , Leiomyoma , Humans , Gene Expression Profiling/methods , Algorithms , RNA Sequence Analysis/methods , Single-Cell Gene Expression Analysis , Single-Cell Analysis/methods , Cluster Analysis , Machine Learning
6.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36627114

ABSTRACT

Dimension reduction (DR) plays an important role in single-cell RNA sequencing (scRNA-seq) analysis, including data interpretation, visualization and other downstream analyses. A desired DR method should be applicable to various application scenarios, including identifying cell types, preserving the inherent structure of data and handling batch effects. However, most of the existing DR methods fail to accommodate these requirements simultaneously, especially removing batch effects. In this paper, we develop a novel structure-preserved dimension reduction (SPDR) method using intra- and inter-batch triplets sampling. The constructed triplets jointly consider each anchor's mutual nearest neighbors from inter-batch, k-nearest neighbors from intra-batch and randomly selected cells from the whole data, which capture higher order structure information and meanwhile account for batch information of the data. Then we minimize a robust loss function for the chosen triplets to obtain a structure-preserved and batch-corrected low-dimensional representation. Comprehensive evaluations show that SPDR outperforms other competing DR methods, such as INSCT, IVIS, Trimap, Scanorama, scVI and UMAP, in removing batch effects, preserving biological variation, facilitating visualization and improving clustering accuracy. Besides, the two-dimensional (2D) embedding of SPDR presents a clear and authentic expression pattern, and can guide researchers to determine how many cell types should be identified. Furthermore, SPDR is robust to complex data characteristics (such as down-sampling, duplicates and outliers) and varying hyperparameter settings. We believe that SPDR will be a valuable tool for characterizing complex cellular heterogeneity.


Subjects
Algorithms , Transcriptome , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Cluster Analysis , RNA Sequence Analysis/methods
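
Illustrative sketch (not part of the indexed record): the SPDR abstract above builds triplets from intra-batch k-nearest neighbors, inter-batch mutual nearest neighbors, and randomly selected cells. The Python below is a small, assumption-laden version of that sampling idea. The function names, the brute-force distance computation, and the standard margin-based triplet loss are all choices made here; the abstract only says the authors minimize "a robust loss function", so this is not their implementation.

```python
import numpy as np

def knn_indices(queries, refs, k):
    """Indices of the k nearest rows of `refs` for each row of `queries`."""
    d2 = ((queries[:, None, :] - refs[None, :, :]) ** 2).sum(axis=-1)
    return np.argsort(d2, axis=1)[:, :k]

def random_negative(rng, n, anchor, pos):
    """Draw a random cell index different from the anchor and the positive."""
    neg = int(rng.integers(n))
    while neg in (anchor, pos):
        neg = int(rng.integers(n))
    return neg

def sample_triplets(x, batch, k=3, seed=0):
    """Return an array of (anchor, positive, negative) index triplets.

    Assumes no duplicate cells and at least k + 1 cells per batch.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    labels = np.unique(batch)
    triplets = []
    for b in labels:
        own = np.flatnonzero(batch == b)
        # Intra-batch kNN; column 0 is the anchor itself, so skip it.
        nn = knn_indices(x[own], x[own], k + 1)[:, 1:]
        for row, anchor in enumerate(own):
            for pos in own[nn[row]]:
                triplets.append((anchor, pos, random_negative(rng, n, anchor, pos)))
        # Inter-batch mutual nearest neighbors with every later batch.
        for c in labels[labels > b]:
            other = np.flatnonzero(batch == c)
            own_to_other = knn_indices(x[own], x[other], k)
            other_to_own = knn_indices(x[other], x[own], k)
            for i, anchor in enumerate(own):
                for j in own_to_other[i]:
                    if i in other_to_own[j]:   # neighbors in both directions
                        pos = other[j]
                        triplets.append((anchor, pos, random_negative(rng, n, anchor, pos)))
    return np.array(triplets)

def triplet_margin_loss(emb, triplets, margin=1.0):
    """Standard margin-based triplet loss (a stand-in, not SPDR's robust loss)."""
    a, p, n = triplets.T
    d_ap = np.linalg.norm(emb[a] - emb[p], axis=1)
    d_an = np.linalg.norm(emb[a] - emb[n], axis=1)
    return float(np.maximum(d_ap - d_an + margin, 0.0).mean())

# Toy usage on random data with two batches of 20 cells each.
rng = np.random.default_rng(1)
cells = rng.standard_normal((40, 5))
batch_labels = np.repeat([0, 1], 20)
triplets = sample_triplets(cells, batch_labels, k=3)
loss = triplet_margin_loss(cells, triplets)
```

In a real pipeline the loss would be evaluated on a learned low-dimensional embedding rather than on the raw features used in this toy call.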
7.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36653900

ABSTRACT

Microbial communities are highly dynamic and sensitive to changes in the environment. Thus, microbiome data are highly susceptible to batch effects, defined as sources of unwanted variation that are not related to and obscure any factors of interest. Existing batch effect correction methods have been primarily developed for gene expression data. As such, they do not consider the inherent characteristics of microbiome data, including zero inflation, overdispersion and correlation between variables. We introduce new multivariate and non-parametric batch effect correction methods based on Partial Least Squares Discriminant Analysis (PLSDA). PLSDA-batch first estimates treatment and batch variation with latent components, then subtracts batch-associated components from the data. The resulting batch-effect-corrected data can then be input in any downstream statistical analysis. Two variants are proposed to handle unbalanced batch × treatment designs and to avoid overfitting when estimating the components via variable selection. We compare our approaches with popular methods managing batch effects, namely, removeBatchEffect, ComBat and Surrogate Variable Analysis, in simulated data and three case studies using various visual and numerical assessments. We show that our three methods lead to competitive performance in removing batch variation while preserving treatment variation, especially for unbalanced batch × treatment designs. Our downstream analyses show selections of biologically relevant taxa. This work demonstrates that batch effect correction methods can improve microbiome research outputs. Reproducible code and vignettes are available on GitHub.


Subjects
Microbiota , Research Design , Least-Squares Analysis , Discriminant Analysis
8.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37080771

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) has significantly accelerated the experimental characterization of distinct cell lineages and types in complex tissues and organisms. Cell-type annotation is of great importance in most scRNA-seq analysis pipelines. However, manual cell-type annotation relies heavily on the quality of scRNA-seq data and marker genes, and therefore can be laborious and time-consuming. Furthermore, the heterogeneity of scRNA-seq datasets poses another challenge for accurate cell-type annotation, such as the batch effect induced by different scRNA-seq protocols and samples. To overcome these limitations, here we propose a novel pipeline, termed TripletCell, for cross-species, cross-protocol and cross-sample cell-type annotation. We developed a cell embedding and dimension-reduction module for feature extraction (FE) in TripletCell, namely TripletCell-FE, which uses a deep metric learning-based algorithm to capture the relationships between the reference gene expression matrix and the query cells. Our experimental studies on 21 datasets (covering nine scRNA-seq protocols, two species and three tissues) demonstrate that TripletCell outperformed state-of-the-art approaches for cell-type annotation. More importantly, regardless of protocols or species, TripletCell can deliver outstanding and robust performance in annotating different types of cells. TripletCell is freely available at https://github.com/liuyan3056/TripletCell. We believe that TripletCell is a reliable computational tool for accurately annotating various cell types using scRNA-seq data and will be instrumental in assisting the generation of novel biological hypotheses in cell biology.


Subjects
Algorithms , Single-Cell Analysis , Single-Cell Analysis/methods , RNA Sequence Analysis/methods , Gene Expression Profiling/methods , Cluster Analysis
9.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-38171928

ABSTRACT

Recent advances in spatial transcriptomics (ST) have enabled comprehensive profiling of gene expression with spatial information in the context of the tissue microenvironment. However, with the improvements in the resolution and scale of ST data, deciphering spatial domains precisely while ensuring efficiency and scalability is still challenging. Here, we develop SGCAST, an efficient auto-encoder framework to identify spatial domains. SGCAST adopts a symmetric graph convolutional auto-encoder to learn aggregated latent embeddings by integrating the gene expression similarity and the proximity of the spatial spots. This framework enables a mini-batch training strategy, which makes SGCAST memory-efficient and scalable to high-resolution spatial transcriptomic data with a large number of spots. SGCAST improves the overall accuracy of spatial domain identification on benchmarking data. We also validated the performance of SGCAST on ST datasets at various scales across multiple platforms. Our study illustrates the superior capacity of SGCAST in analyzing spatial transcriptomic data.


Subjects
Gene Expression Profiling , Transcriptome , Benchmarking , Learning
10.
Bioinformatics ; 2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39226186

ABSTRACT

MOTIVATION: Systems biology analyses often use correlations in gene expression profiles to infer co-expression networks that are then used as input for gene regulatory network inference or to identify functional modules of co-expressed or putatively co-regulated genes. While systematic biases, including batch effects, are known to induce spurious associations and confound differential gene expression analyses (DE), the impact of batch effects on gene co-expression has not been fully explored. Methods have been developed to adjust expression values, ensuring conditional independence of mean and variance from batch or other covariates for each gene, resulting in improved fidelity of DE analysis. However, such adjustments do not address the potential for spurious differential co-expression (DC) between groups. Consequently, uncorrected, artifactual DC can skew the correlation structure, leading to the identification of false, non-biological associations, even when the input data is corrected using standard batch correction. RESULTS: In this work, we demonstrate the persistence of confounders in covariance after standard batch correction using synthetic and real-world gene expression data examples. We then introduce Co-expression Batch Reduction Adjustment (COBRA), a method for computing a batch-corrected gene co-expression matrix based on estimating a conditional covariance matrix. COBRA estimates a reduced set of parameters expressing the co-expression matrix as a function of the sample covariates, allowing control for continuous and categorical covariates. COBRA is computationally efficient, leveraging the inherently modular structure of genomic data to estimate accurate gene regulatory associations and facilitate functional analysis for high-dimensional genomic data. AVAILABILITY AND IMPLEMENTATION: COBRA is available under the GPL-3 open source license in R and Python in netZoo (https://netzoo.github.io).
SUPPLEMENTARY INFORMATION: Supplementary information is available at Bioinformatics online.
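
Illustrative sketch (not part of the indexed record): the COBRA abstract above states that confounding can persist in the covariance after standard batch correction of mean and variance. The toy Python below shows that effect on synthetic data. It is not COBRA and does not use netZoo; the two-gene setup, the sample size, and the per-batch standardization used as a stand-in for "standard batch correction" are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # samples per batch (arbitrary choice for the toy example)

def two_genes(corr, shift, scale):
    """Draw n samples of two genes with a given correlation, batch shift and scale."""
    cov = np.array([[1.0, corr], [corr, 1.0]])
    return rng.multivariate_normal([0.0, 0.0], cov, size=n) * scale + shift

batch_a = two_genes(corr=0.9, shift=5.0, scale=2.0)   # genes strongly co-expressed
batch_b = two_genes(corr=0.0, shift=-1.0, scale=0.5)  # genes uncorrelated

def standardize(x):
    """Per-batch location/scale adjustment: zero mean, unit variance per gene."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

# Means and variances now agree across batches, but the correlations do not.
corr_a = np.corrcoef(standardize(batch_a), rowvar=False)[0, 1]
corr_b = np.corrcoef(standardize(batch_b), rowvar=False)[0, 1]
```

Because correlation is invariant to per-gene shifts and rescaling, `corr_a` stays near 0.9 and `corr_b` near 0, so a batch-specific difference in co-expression survives the mean and variance adjustment.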

11.
J Infect Dis ; 2024 Aug 27.
Article in English | MEDLINE | ID: mdl-39189314

ABSTRACT

As investigations of low-biomass microbial communities have become more common, so too has the recognition of major challenges affecting these analyses. These challenges have been shown to compromise biological conclusions and have contributed to several controversies. Here, we review some of the most common and influential challenges in low-biomass microbiome research. We highlight key approaches to alleviate these potential pitfalls, combining experimental planning strategies and data analysis methods.

12.
Proteomics ; 24(3-4): e2200424, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37750450

ABSTRACT

Fractionation of proteoforms is currently the most challenging topic in the field of proteoform analysis. The need for considering the existence of proteoforms in experimental approaches is not only important in Life Science research in general but especially in the manufacturing of therapeutic proteins (TPs) like recombinant therapeutic antibodies (mAbs). Some of the proteoforms of TPs have significantly decreased actions or even cause side effects. The identification and removal of proteoforms differing from the main species, having the desired action, is challenging because the difference in the composition of atoms is often very small and their concentration in comparison to the main proteoform can be low. In this study, we demonstrate that sample displacement batch chromatography (SDBC) is an easy-to-handle, economical, and efficient method for fractionating proteoforms. As a model sample a commercial ovalbumin fraction was used, containing many ovalbumin proteoforms. The most promising parameters for the SDBC were determined by a screening approach and applied for a 10-segment fractionation of ovalbumin with cation exchange chromatography resins. Mass spectrometry of intact proteoforms was used for characterizing the SDBC fractionation process. By SDBC, a significant separation of different proteoforms was obtained.


Subjects
Post-Translational Protein Processing , Tandem Mass Spectrometry , Tandem Mass Spectrometry/methods , Ovalbumin/metabolism , Chromatography , Proteome/analysis
13.
BMC Bioinformatics ; 25(1): 60, 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38321388

ABSTRACT

BACKGROUND: As a gold-standard quantitative technique based on mass spectrometry, multiple reaction monitoring (MRM) has been widely used in proteomics and metabolomics. In the analysis of MRM data, as no peak picking algorithm can achieve perfect accuracy, manual inspection is necessary to correct the errors. In large cohort analysis scenarios, the time required for manual inspection is often considerable. Apart from the commercial software that comes with mass spectrometers, the open-source and free software Skyline is the most popular software for quantitative omics. However, this software is not optimized for manual inspection of hundreds of samples, and the interactive experience also needs to be improved. RESULTS: Here we introduce MRMPro, a web-based MRM data analysis platform for efficient manual inspection. MRMPro supports data analysis of MRM and scheduled MRM data acquired by mass spectrometers of mainstream vendors. With the goal of improving the speed of manual inspection, we implemented a collaborative review system based on cloud architecture, allowing multiple users to review through browsers. To reduce bandwidth usage and improve data retrieval speed, we proposed an MRM data compression algorithm, which reduced data volume by more than 60% and 80% compared to the vendor and mzML formats, respectively. To improve the efficiency of manual inspection, we proposed a retention time drift estimation algorithm based on similarity of chromatograms. The estimated retention time drifts were then used for peak alignment and automatic EIC grouping. Compared with Skyline, MRMPro has higher quantification accuracy and better manual inspection support. CONCLUSIONS: In this study, we proposed MRMPro to improve the usability of manual calibration for MRM data analysis. MRMPro is free for non-commercial use. Researchers can access MRMPro through http://mrmpro.csibio.com/ . All major mass spectrometry formats (wiff, raw, mzML, etc.) can be analyzed on the platform. The final identification results can be exported to a common .xlsx format for subsequent analysis.


Subjects
Algorithms , Data Compression , Humans , Calibration , Mass Spectrometry/methods , Software , Internet
14.
BMC Bioinformatics ; 25(1): 181, 2024 May 08.
Article in English | MEDLINE | ID: mdl-38720247

ABSTRACT

BACKGROUND: RNA sequencing combined with machine learning techniques has provided a modern approach to the molecular classification of cancer. Class predictors, reflecting the disease class, can be constructed for known tissue types using the gene expression measurements extracted from cancer patients. One challenge of current cancer predictors is that they often have suboptimal performance estimates when integrating molecular datasets generated from different labs. Often, the quality of the data is variable, procured differently, and contains unwanted noise hampering the ability of a predictive model to extract useful information. Data preprocessing methods can be applied in attempts to reduce these systematic variations and harmonize the datasets before they are used to build a machine learning model for resolving tissue of origin. RESULTS: We aimed to investigate the impact of data preprocessing steps (focusing on normalization, batch effect correction, and data scaling) through trial and comparison. Our goal was to improve the cross-study predictions of tissue of origin for common cancers on large-scale RNA-Seq datasets derived from thousands of patients and over a dozen tumor types. The results showed that the choice of data preprocessing operations affected the performance of the associated classifier models constructed for tissue of origin predictions in cancer. CONCLUSION: By using TCGA as a training set and applying data preprocessing methods, we demonstrated that batch effect correction improved performance measured by weighted F1-score in resolving tissue of origin against an independent GTEx test dataset. On the other hand, the use of data preprocessing operations worsened classification performance when the independent test dataset was aggregated from separate studies in ICGC and GEO.
Therefore, based on our findings with these publicly available large-scale RNA-Seq datasets, the application of data preprocessing techniques to a machine learning pipeline is not always appropriate.


Subjects
Machine Learning , Neoplasms , RNA-Seq , Humans , RNA-Seq/methods , Neoplasms/genetics , Transcriptome/genetics , RNA Sequence Analysis/methods , Gene Expression Profiling/methods , Computational Biology/methods
15.
J Cell Biochem ; 125(1): 59-78, 2024 01.
Article in English | MEDLINE | ID: mdl-38047468

ABSTRACT

The study aimed to evaluate the antioxidant activity, protein kinase inhibitory (PKI) potential, and cytotoxic activity of Streptomyces clavuligerus extract. The DPPH assay revealed a robust free radical scavenging capacity (IC50 28.90 ± 0.24 µg/mL) of the organic extract with a maximum inhibition percentage of 61 ± 1.04%. The PKI assay revealed the formation of a whitish bald zone by S. clavuligerus extracts, which indicates the presence of PKIs. The cytotoxic activity of the organic fraction of the extract, measured by the Sulforhodamine B assay on MCF-7, Hop-62, SiHa, and PC-3 cell lines, showed the lowest GI50 value against the MCF-7 cell line followed by the PC-3 cell line, indicating potent growth inhibitory potential against human breast cancer and human prostate cancer cell lines. HR-LCMS analysis identified multiple secondary metabolites from the organic and aqueous extracts of S. clavuligerus when incubated at 30°C under 200 rpm for 3 days. All the secondary metabolites were evaluated for their potential to inhibit RTKs by molecular docking, molecular dynamic simulation, MM/GBSA calculations, and a free energy approach. This revealed the superior inhibitory potential of epirubicin (Epi) and dodecaprenyl phosphate-galacturonic acid (DPGA) against fibroblast growth factor receptor (FGFR). Epi also exhibited excellent inhibitory activity against the platelet-derived growth factor receptor (PDGFR), while DPGA effectively inhibited the vascular endothelial growth factor receptor. Additionally, the presence of Epi in S. clavuligerus extract was validated through the HPLC technique. Thus, our findings highlight a superior inhibitory potential of Epi against FGFR and PDGFR RTKs than the FDA-approved drug.


Subjects
Neoplasms , Protein Kinase Inhibitors , Streptomyces , Male , Humans , Protein Kinase Inhibitors/pharmacology , Molecular Docking Simulation , Vascular Endothelial Growth Factor A , Epirubicin , MCF-7 Cells
16.
Biochem Biophys Res Commun ; 731: 150383, 2024 Oct 30.
Article in English | MEDLINE | ID: mdl-39024977

ABSTRACT

(R)-selective transaminases have the potential to act as efficient biocatalysts for the synthesis of important pharmaceutical intermediates. However, their low catalytic efficiency and unfavorable equilibrium limit their industrial application. Seven (R)-selective transaminases were identified using homologous sequence mining. Beginning with the optimal candidate from Mycolicibacterium hippocampi, virtual mutagenesis and substrate tunnel engineering were performed to improve catalytic efficiency. The obtained variant, T282S/Q137E, exhibited 3.68-fold greater catalytic efficiency (kcat/Km) than the wild-type enzyme. Using substrate fed-batch and air sweeping processes, effective conversion of 100 mM 4-hydroxy-2-butanone was achieved with a conversion rate of 93 % and an ee value > 99.9 %. This study provides a basis for mutation of (R)-selective transaminases and offers an efficient biocatalytic process for the asymmetric synthesis of (R)-3-aminobutanol.


Subjects
Protein Engineering , Transaminases , Transaminases/metabolism , Transaminases/genetics , Transaminases/chemistry , Protein Engineering/methods , Substrate Specificity , Binding Sites , Biocatalysis , Mutagenesis , Site-Directed Mutagenesis , Molecular Models , Burkholderiaceae/enzymology , Burkholderiaceae/genetics , Kinetics
17.
Biostatistics ; 24(4): 1031-1044, 2023 10 18.
Article in English | MEDLINE | ID: mdl-35536588

ABSTRACT

Experimental design usually focuses on the setting where treatments and/or other aspects of interest can be manipulated. However, in observational biomedical studies with sequential processing, the set of available samples is often fixed, and the problem is thus rather the ordering and allocation of samples to batches such that comparisons between different treatments can be made with similar precision. In certain situations, this allocation can be done by hand, but this rapidly becomes impractical with more challenging cohort setups. Here, we present a fast and intuitive algorithm to generate balanced allocations of samples to batches for any single-variable model where the treatment variable is nominal. This greatly simplifies the grouping of samples into batches, makes the process reproducible, and provides a marked improvement over completely random allocations. The general challenges of allocation and why good solutions can be hard to find are also discussed, as well as potential extensions to multivariable settings.


Subjects
Algorithms , Observational Studies as Topic , Humans , Research Design
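
Illustrative sketch (not part of the indexed record): the abstract above concerns allocating a fixed set of samples to batches so that a nominal treatment variable is balanced. The record does not describe the authors' algorithm, so the Python below is only a simple stand-in: shuffle within each treatment level, then deal samples to batches round-robin. The function name and the example treatment labels are hypothetical.

```python
import random
from collections import Counter

def allocate_balanced(treatments, n_batches, seed=0):
    """Assign each sample index to a batch so that every treatment level is
    spread as evenly as possible across batches (round-robin dealing)."""
    rng = random.Random(seed)
    by_level = {}
    for idx, level in enumerate(treatments):
        by_level.setdefault(level, []).append(idx)
    assignment = [None] * len(treatments)
    next_batch = 0  # carried across levels so batch sizes also stay even
    for level in sorted(by_level):
        members = by_level[level]
        rng.shuffle(members)  # random order within a treatment level
        for idx in members:
            assignment[idx] = next_batch
            next_batch = (next_batch + 1) % n_batches
    return assignment

# Hypothetical cohort: 6 controls, 6 on drugA, 4 on drugB, processed in 4 batches.
treatments = ["control"] * 6 + ["drugA"] * 6 + ["drugB"] * 4
batches = allocate_balanced(treatments, n_batches=4)
counts = Counter(zip(batches, treatments))
```

With this dealing scheme, the number of samples of any one treatment differs by at most one between batches, and the seed makes the allocation reproducible. Multivariable balancing, which the abstract mentions as a possible extension, is outside the scope of this sketch.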
18.
Biostatistics ; 24(3): 635-652, 2023 Jul 14.
Article in English | MEDLINE | ID: mdl-34893807

ABSTRACT

Nonignorable technical variation is commonly observed across data from multiple experimental runs, platforms, or studies. These so-called batch effects can lead to difficulty in merging data from multiple sources, as they can severely bias the outcome of the analysis. Many groups have developed approaches for removing batch effects from data, usually by accommodating batch variables into the analysis (one-step correction) or by preprocessing the data prior to the formal or final analysis (two-step correction). One-step correction is often desirable due to its simplicity, but its flexibility is limited and it can be difficult to include batch variables uniformly when an analysis has multiple stages. Two-step correction allows for richer models of batch mean and variance. However, prior investigation has indicated that two-step correction can lead to incorrect statistical inference in downstream analysis. Generally speaking, two-step approaches introduce a correlation structure in the corrected data, which, if ignored, may lead to either exaggerated or diminished significance in downstream applications such as differential expression analysis. Here, we provide more intuitive and more formal evaluations of the impacts of two-step batch correction compared to existing literature. We demonstrate that the undesired impacts of two-step correction (exaggerated or diminished significance) depend on both the nature of the study design and the batch effects. We also provide strategies for overcoming these negative impacts in downstream analyses using the estimated correlation matrix of the corrected data. We compare the results of our proposed workflow with the results from other published one-step and two-step methods and show that our methods lead to more consistent false discovery controls and power of detection across a variety of batch effect scenarios.
Software for our method is available through GitHub (https://github.com/jtleek/sva-devel) and will be available in future versions of the sva R package in the Bioconductor project (https://bioconductor.org/packages/release/bioc/html/sva.html).


Subjects
Gene Expression , Humans , Phylogeny , Research Design
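
Illustrative sketch (not part of the indexed record): the abstract above notes that two-step batch correction introduces a correlation structure in the corrected data. The Python below shows this for the simplest possible two-step correction, subtracting each batch's mean, which is an assumption chosen here for clarity. It is not the authors' method and does not use the sva package; the batch sizes are arbitrary.

```python
import numpy as np

# Hypothetical design: 4 samples in batch 0 and 6 samples in batch 1.
batch = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
design = (batch[:, None] == np.unique(batch)[None, :]).astype(float)  # one-hot batch design

# Hat matrix H of the batch design; (I - H) subtracts each sample's batch mean.
hat = design @ np.linalg.inv(design.T @ design) @ design.T
residual_maker = np.eye(len(batch)) - hat

# If the raw samples are independent with unit variance, the corrected samples
# have covariance (I - H)(I - H)^T = I - H, which is not diagonal.
corrected_cov = residual_maker @ residual_maker.T
```

Within a batch of size n_b, every pair of corrected samples has covariance -1/n_b (here -1/4 and -1/6), while samples from different batches remain uncorrelated. A downstream test that treats the corrected samples as independent ignores exactly this structure.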
19.
Small ; : e2403187, 2024 Aug 02.
Article in English | MEDLINE | ID: mdl-39092678

ABSTRACT

2D materials with atomically thin nature are promising to develop scaled transistors and enable the extreme miniaturization of electronic components. However, batch manufacturing of top-gate 2D transistors remains a challenge since gate dielectrics or gate electrodes transferred from 2D material easily peel away as gate pitch decreases to the nanometer scale during lift-off processes. In this study, an oxidation-assisted etching technique is developed for batch manufacturing of nanopatterned high-κ/metal gate (HKMG) stacks on 2D materials. This strategy produces nano-pitch self-oxidized Al2O3/Al patterns with a resolution of 150 nm on 2D channel material, including graphene, MoS2, and WS2 without introducing any additional damage. Through a gate-first technology in which the Al2O3/Al gate stacks are used as a mask for the formation of source and drain, a short-channel HKMG MoS2 transistor with a nearly ideal subthreshold swing (SS) of 61 mV dec⁻¹, and HKMG graphene transistor with a cut-off frequency of 150 GHz are achieved. Moreover, both graphene and MoS2 HKMG transistor arrays exhibit high uniformity. The study may bring the potential for the massive production of large-scale integrated circuits using 2D materials.

20.
Chembiochem ; 25(9): e202400006, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38457364

ABSTRACT

High cell density cultivation is an established method for the production of various industrially important products such as recombinant proteins. However, these protocols are not always suitable for biocatalytic processes as the focus often lies on biomass production rather than high specific activities of the enzyme inside the cells. In contrast, a range of shake flask protocols are well known with high specific activities but rather low cell densities. To overcome this gap, we established a tailor-made fed-batch protocol combining both aspects: high cell density and high specific activities of heterologously produced enzyme. Using the example of an industrially relevant amine transaminase from Bacillus megaterium, we describe a strategy to optimize the cultivation yield based on the feed rate, IPTG concentration, and post-induction temperature. By adjusting these key parameters, we were able to increase the specific activity 2.6-fold and the wet cell weight 17-fold compared to shake flasks. Finally, we were able to verify our established protocol by transferring it to another experimenter. With that, our optimization strategy can serve as a template for the production of high titers of heterologously produced, active enzymes and might enable the availability of these catalysts for upscaling biocatalytic processes.


Subjects
Bacillus megaterium , Escherichia coli , Transaminases , Bacillus megaterium/enzymology , Bacillus megaterium/metabolism , Transaminases/metabolism , Transaminases/genetics , Escherichia coli/metabolism , Escherichia coli/genetics , Recombinant Proteins/metabolism , Recombinant Proteins/biosynthesis , Recombinant Proteins/genetics , Amines/metabolism , Amines/chemistry , Biocatalysis