Results 1 - 11 of 11
1.
Opt Express ; 31(26): 43771-43789, 2023 Dec 18.
Article in English | MEDLINE | ID: mdl-38178466

ABSTRACT

The vertical distribution of the diffuse attenuation coefficient K(z, λ) is critical for studies in bio-optics, ocean color remote sensing, underwater photovoltaic power, etc. It is a key apparent optical property (AOP) and is sensitive to the volume scattering function β(ψ, z, λ). Here, using three machine learning algorithms (MLAs), namely categorical boosting (CatBoost), light gradient boosting machine (LightGBM), and random forest (RF), we developed a new approach for estimating the vertical distribution of Kd(z, 650), KLu(z, 650), and Ku(z, 650) and applied it to the South China Sea (SCS). In this approach, based on in situ β(ψ, z, 650), the absorption coefficient a(z, 650), the profile depths z, and Kd(z, 650), KLu(z, 650), and Ku(z, 650) calculated by Hydrolight 6.0 (HL6.0), three machine learning models (MLMs) with or without boundary conditions for estimating Kd(z, 650), KLu(z, 650), and Ku(z, 650) were established, evaluated, compared, and applied. It was found that (1) CatBoost models have superior performance, with R² ≥ 0.92, RMSE ≤ 0.021 m⁻¹, and MAPE ≤ 4.3%, and agree most closely with HL6.0 simulations; (2) the consistency between HL6.0 simulations and MLM estimations improves when the boundary conditions are incorporated; (3) the estimations of Kd(z, 650), KLu(z, 650), and Ku(z, 650) derived from CatBoost models with and without boundary conditions are in good agreement, with R² ≥ 0.992, RMSE ≤ 0.007 m⁻¹, and MAPE ≤ 0.8%; (4) Kd(z, 650), KLu(z, 650), and Ku(z, 650) show an overall decreasing trend with increasing depth and increasing offshore distance in the SCS. The MLMs for estimating K(z, λ) could provide more accurate information for the study of underwater light field distribution, water quality assessment and the validation of remote sensing data products.
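
The three agreement metrics quoted in this abstract (R², RMSE, MAPE) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names and the sample values are hypothetical, and MAPE is computed here under the assumption that no reference value is zero.

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the inputs (e.g. 1/m)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no zero in y_true."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical attenuation values (1/m) standing in for a reference
# simulation and a model estimate; they are NOT taken from the paper.
reference = [0.40, 0.38, 0.36, 0.35]
estimate = [0.41, 0.37, 0.36, 0.34]

scores = {
    "r2": r_squared(reference, estimate),
    "rmse": rmse(reference, estimate),
    "mape": mape(reference, estimate),
}
```

A higher R² and lower RMSE and MAPE indicate closer agreement between the estimate and the reference profile.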

2.
Stem Cells ; 40(3): 273-289, 2022 03 31.
Article in English | MEDLINE | ID: mdl-35356986

ABSTRACT

Insulin-like growth factor I (IGF-1) has been implicated in breast cancer due to its mitogenic and anti-apoptotic effects. Despite substantial research on the role of IGF-1 in tumor progression, the relationship of IGF-1 to tissue stem cells, particularly in mammary tissue, and the resulting tumor susceptibility has not been elucidated. Previous studies with the BK5.IGF-1 transgenic (Tg) mouse model reveal that IGF-1 does not act as a classical, post-carcinogen tumor promoter in the mammary gland. Pre-pubertal Tg mammary glands display increased numbers and enlarged sizes of terminal end buds, a niche for mammary stem cells (MaSCs). Here we show that MaSCs from both wild-type (WT) and Tg mice expressed IGF-1R and that overexpression of Tg IGF-1 increased numbers of MaSCs by promoting symmetric division, resulting in an expansion of the MaSC and luminal progenitor (LP) compartments in pre-pubertal female mice. This expansion was maintained post-pubertally and validated by mammosphere assays in vitro and transplantation assays in vivo. The addition of recombinant IGF-1 promoted, and IGF-1R downstream inhibitors decreased, mammosphere formation. Single-cell transcriptomic profiles generated from two related platforms reveal that IGF-1 stimulated quiescent MaSCs to enter the cell cycle and increased their expression of genes involved in proliferation, plasticity, tumorigenesis, invasion, and metastasis. This study identifies a novel, pro-tumorigenic mechanism, where IGF-1 increases the number of transformation-susceptible carcinogen targets during the early stages of mammary tissue development, and "primes" their gene expression profiles for transformation.


Subjects
Insulin-Like Growth Factor I; Mammary Glands, Animal; Animals; Cell Proliferation; Female; Humans; Insulin-Like Growth Factor I/metabolism; Mice; Mice, Transgenic; Stem Cells/metabolism
3.
Nat Biomed Eng ; 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38514775

ABSTRACT

Training machine-learning models with synthetically generated data can alleviate the problem of data scarcity when acquiring diverse and sufficiently large datasets is costly and challenging. Here we show that cascaded diffusion models can be used to synthesize realistic whole-slide image tiles from latent representations of RNA-sequencing data from human tumours. Alterations in gene expression affected the composition of cell types in the generated synthetic image tiles, which accurately preserved the distribution of cell types and maintained the cell fraction observed in bulk RNA-sequencing data, as we show for lung adenocarcinoma, kidney renal papillary cell carcinoma, cervical squamous cell carcinoma, colon adenocarcinoma and glioblastoma. Machine-learning models pretrained with the generated synthetic data performed better than models trained from scratch. Synthetic data may accelerate the development of machine-learning models in scarce-data settings and allow for the imputation of missing data modalities.

4.
bioRxiv ; 2024 Jan 19.
Article in English | MEDLINE | ID: mdl-37808782

ABSTRACT

Cancer is a heterogeneous disease that demands precise molecular profiling for better understanding and management. Recently, deep learning has demonstrated potential for cost-efficient prediction of molecular alterations from histology images. While transformer-based deep learning architectures have enabled significant progress in non-medical domains, their application to histology images remains limited due to small dataset sizes coupled with the explosion of trainable parameters. Here, we develop SEQUOIA, a transformer model to predict cancer transcriptomes from whole-slide histology images. To enable the full potential of transformers, we first pre-train the model using data from 1,802 normal tissues. Then, we fine-tune and evaluate the model in 4,331 tumor samples across nine cancer types. The prediction performance is assessed at the individual gene level and the pathway level through Pearson correlation analysis and root mean square error. The generalization capacity is validated across two independent cohorts comprising 1,305 tumors. In predicting the expression levels of 25,749 genes, the highest performance is observed in cancers from breast, kidney and lung, where SEQUOIA accurately predicts the expression of 11,069, 10,086 and 8,759 genes, respectively. The accurately predicted genes are associated with the regulation of inflammatory response, cell cycle and metabolism. While the model is trained at the tissue level, we showcase its potential in predicting spatial gene expression patterns using spatial transcriptomics datasets. Leveraging the prediction performance, we develop a digital gene expression signature that predicts the risk of recurrence in breast cancer. SEQUOIA deciphers clinically relevant gene expression patterns from histology images, opening avenues for improved cancer management and personalized therapies.
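
The gene-level assessment mentioned in this abstract (Pearson correlation between measured and predicted expression) might look like the following minimal Python sketch. It is an illustration only, not SEQUOIA's code: the function names, the gene labels `GENE_A` and `GENE_B`, and all expression values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def per_gene_correlation(measured, predicted):
    """Correlate measured and predicted expression across samples, gene by gene.

    Both arguments map a gene name to a list of values, one per sample.
    """
    return {gene: pearson_r(measured[gene], predicted[gene]) for gene in measured}

# Hypothetical expression values for four samples (not data from the study).
measured = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]}
predicted = {"GENE_A": [2.0, 4.0, 6.0, 8.0], "GENE_B": [4.0, 3.0, 2.0, 1.0]}

correlations = per_gene_correlation(measured, predicted)
```

In this toy input, `GENE_A` is predicted in perfect linear agreement and `GENE_B` in perfect disagreement, which makes the two ends of the correlation scale easy to see.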

5.
bioRxiv ; 2023 Jan 04.
Article in English | MEDLINE | ID: mdl-36711917

ABSTRACT

DNA methylation (DNAme) is a major epigenetic factor influencing gene expression with alterations leading to cancer, immunological, and cardiovascular diseases. Recent technological advances enable genome-wide quantification of DNAme in large human cohorts. So far, existing methods have not been evaluated to identify differential DNAme present in large and heterogeneous patient cohorts. We developed an end-to-end analytical framework named "EpiMix" for population-level analysis of DNAme and gene expression. Compared to existing methods, EpiMix showed higher sensitivity in detecting abnormal DNAme that was present in only small patient subsets. We extended the model-based analyses of EpiMix to cis-regulatory elements within protein-coding genes, distal enhancers, and genes encoding microRNAs and lncRNAs. Using cell-type specific data from two separate studies, we discovered novel epigenetic mechanisms underlying childhood food allergy and survival-associated, methylation-driven non-coding RNAs in non-small cell lung cancer.

6.
Cell Rep Methods ; 3(7): 100515, 2023 07 24.
Article in English | MEDLINE | ID: mdl-37533639

ABSTRACT

DNA methylation (DNAme) is a major epigenetic factor influencing gene expression with alterations leading to cancer and immunological and cardiovascular diseases. Recent technological advances have enabled genome-wide profiling of DNAme in large human cohorts. There is a need for analytical methods that can more sensitively detect differential methylation profiles present in subsets of individuals from these heterogeneous, population-level datasets. We developed an end-to-end analytical framework named "EpiMix" for population-level analysis of DNAme and gene expression. Compared with existing methods, EpiMix showed higher sensitivity in detecting abnormal DNAme that was present in only small patient subsets. We extended the model-based analyses of EpiMix to cis-regulatory elements within protein-coding genes, distal enhancers, and genes encoding microRNAs and long non-coding RNAs (lncRNAs). Using cell-type-specific data from two separate studies, we discover epigenetic mechanisms underlying childhood food allergy and survival-associated, methylation-driven ncRNAs in non-small cell lung cancer.


Subjects
Carcinoma, Non-Small-Cell Lung; Lung Neoplasms; Humans; Child; DNA Methylation/genetics; Carcinoma, Non-Small-Cell Lung/genetics; Epigenomics/methods; Lung Neoplasms/diagnosis; Epigenesis, Genetic
7.
Nat Commun ; 14(1): 4122, 2023 07 11.
Article in English | MEDLINE | ID: mdl-37433817

ABSTRACT

Intra-tumoral heterogeneity and cell-state plasticity are key drivers for the therapeutic resistance of glioblastoma. Here, we investigate the association between spatial cellular organization and glioblastoma prognosis. Leveraging single-cell RNA-seq and spatial transcriptomics data, we develop a deep learning model to predict transcriptional subtypes of glioblastoma cells from histology images. Employing this model, we phenotypically analyze 40 million tissue spots from 410 patients and identify consistent associations between tumor architecture and prognosis across two independent cohorts. Patients with poor prognosis exhibit higher proportions of tumor cells expressing a hypoxia-induced transcriptional program. Furthermore, a clustering pattern of astrocyte-like tumor cells is associated with worse prognosis, while dispersion and connection of the astrocytes with other transcriptional subtypes correlate with decreased risk. To validate these results, we develop a separate deep learning model that utilizes histology images to predict prognosis. Applying this model to spatial transcriptomics data reveals survival-associated regional gene expression programs. Overall, our study presents a scalable approach to unravel the transcriptional heterogeneity of glioblastoma and establishes a critical connection between spatial cellular architecture and clinical outcomes.


Subjects
Glioblastoma; Humans; Glioblastoma/genetics; Astrocytes; Cell Plasticity; Cluster Analysis; Gene Expression Profiling
8.
bioRxiv ; 2023 Jul 10.
Article in English | MEDLINE | ID: mdl-36711711

ABSTRACT

Data scarcity presents a significant obstacle in the field of biomedicine, where acquiring diverse and sufficient datasets can be costly and challenging. Synthetic data generation offers a potential solution to this problem by expanding dataset sizes, thereby enabling the training of more robust and generalizable machine learning models. Although previous studies have explored synthetic data generation for cancer diagnosis, they have predominantly focused on single modality settings, such as whole-slide image tiles or RNA-Seq data. To bridge this gap, we propose a novel approach, RNA-Cascaded-Diffusion-Model or RNA-CDM, for performing RNA-to-image synthesis in a multi-cancer context, drawing inspiration from successful text-to-image synthesis models used in natural images. In our approach, we employ a variational auto-encoder to reduce the dimensionality of a patient's gene expression profile, effectively distinguishing between different types of cancer. Subsequently, we employ a cascaded diffusion model to synthesize realistic whole-slide image tiles using the latent representation derived from the patient's RNA-Seq data. Our results demonstrate that the generated tiles accurately preserve the distribution of cell types observed in real-world data, with state-of-the-art cell identification models successfully detecting important cell types in the synthetic samples. Furthermore, we illustrate that the synthetic tiles maintain the cell fraction observed in bulk RNA-Seq data and that modifications in gene expression affect the composition of cell types in the synthetic tiles. Next, we utilize the synthetic data generated by RNA-CDM to pretrain machine learning models and observe improved performance compared to training from scratch. 
Our study emphasizes the potential usefulness of synthetic data in developing machine learning models in scarce-data settings, while also highlighting the possibility of imputing missing data modalities by leveraging the available information. In conclusion, our proposed RNA-CDM approach for synthetic data generation in biomedicine, particularly in the context of cancer diagnosis, offers a novel and promising solution to address data scarcity. By generating synthetic data that aligns with real-world distributions and leveraging it to pretrain machine learning models, we contribute to the development of robust clinical decision support systems and potential advancements in precision medicine.

9.
Commun Med (Lond) ; 3(1): 44, 2023 Mar 29.
Article in English | MEDLINE | ID: mdl-36991216

ABSTRACT

BACKGROUND: The introduction of deep learning (DL) in both imaging and genomics has significantly advanced the analysis of biomedical data. For complex diseases such as cancer, different data modalities may reveal different disease characteristics, and the integration of imaging with genomic data has the potential to reveal more information than using these data sources in isolation. Here, we propose a DL framework that combines these two modalities with the aim of predicting brain tumor prognosis. METHODS: Using two separate glioma cohorts of 783 adults and 305 pediatric patients, we developed a DL framework that can fuse histopathology images with gene expression profiles. Three strategies for data fusion were implemented and compared: early, late, and joint fusion. Additional validation of the adult glioma models was done on an independent cohort of 97 adult patients. RESULTS: Here we show that the developed multimodal data models not only achieve better prediction results than the single data models but also lead to the identification of more relevant biological pathways. When testing our adult models on a third brain tumor dataset, we show our multimodal framework is able to generalize and performs better on new data from different cohorts. Leveraging the concept of transfer learning, we demonstrate how our pediatric multimodal models can be used to predict prognosis for two rarer pediatric brain tumors (with fewer available samples). CONCLUSIONS: Our study illustrates that a multimodal data fusion approach can be successfully implemented and customized to model clinical outcome of adult and pediatric brain tumors.


An increasing amount of complex patient data is generated when treating patients with cancer, including histopathology data (where the appearance of a tumor is examined under a microscope) and molecular data (such as analysis of a tumor's genetic material). Computational methods to integrate these data types might help us to predict outcomes in patients with cancer. Here, we propose a deep learning method, which involves computer software learning from patterns in the data, to combine histopathology and molecular data to predict outcomes in patients with brain cancers. Using three cohorts of patients, we show that our method combining the different datasets performs better than models using one data type. Methods like ours might help clinicians to better inform patients about their prognosis and make decisions about their care.
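
The difference between two of the fusion strategies named in this abstract can be sketched in a few lines of Python. This is a generic illustration of early and late fusion, not the study's deep learning framework: the linear scorer, the function names, and every feature and weight value are hypothetical. Joint fusion, in which the per-modality feature extractors are trained together with the prediction head, is not shown because it needs a trainable model.

```python
def linear_score(features, weights):
    """Toy stand-in for a trained model: a weighted sum of the features."""
    return sum(f * w for f, w in zip(features, weights))

def early_fusion(image_features, expression_features, weights):
    """Early fusion: concatenate both modalities, then apply one model."""
    fused = list(image_features) + list(expression_features)
    return linear_score(fused, weights)

def late_fusion(image_features, expression_features,
                image_weights, expression_weights):
    """Late fusion: score each modality separately, then average the scores."""
    image_score = linear_score(image_features, image_weights)
    expression_score = linear_score(expression_features, expression_weights)
    return 0.5 * (image_score + expression_score)

# Hypothetical feature vectors and weights, for illustration only.
image_features = [1.0, 2.0]      # e.g. features extracted from a histology image
expression_features = [3.0]      # e.g. features from a gene expression profile

early_risk = early_fusion(image_features, expression_features, [0.5, 0.25, 1.0])
late_risk = late_fusion(image_features, expression_features, [0.5, 0.25], [1.0])
```

Early fusion lets one model see interactions between modalities, while late fusion keeps the per-modality models independent and combines only their outputs.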

10.
Nat Med ; 29(3): 738-747, 2023 03.
Article in English | MEDLINE | ID: mdl-36864252

ABSTRACT

Undetected infection and delayed isolation of infected individuals are key factors driving the monkeypox virus (now termed mpox virus or MPXV) outbreak. To enable earlier detection of MPXV infection, we developed an image-based deep convolutional neural network (named MPXV-CNN) for the identification of the characteristic skin lesions caused by MPXV. We assembled a dataset of 139,198 skin lesion images, split into training/validation and testing cohorts, comprising non-MPXV images (n = 138,522) from eight dermatological repositories and MPXV images (n = 676) from the scientific literature, news articles, social media and a prospective cohort of the Stanford University Medical Center (n = 63 images from 12 patients, all male). In the validation and testing cohorts, the sensitivity of the MPXV-CNN was 0.83 and 0.91, the specificity was 0.965 and 0.898 and the area under the curve was 0.967 and 0.966, respectively. In the prospective cohort, the sensitivity was 0.89. The classification performance of the MPXV-CNN was robust across various skin tones and body regions. To facilitate the usage of the algorithm, we developed a web-based app by which the MPXV-CNN can be accessed for patient guidance. The capability of the MPXV-CNN for identifying MPXV lesions has the potential to aid in MPXV outbreak mitigation.


Subjects
Deep Learning; Mpox; Humans; Male; Prospective Studies; Monkeypox virus; Algorithms
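
The sensitivity and specificity figures reported in this entry's abstract are ratios over confusion-matrix counts. A minimal Python sketch of the two definitions follows; the counts are hypothetical and do not reproduce the study's cohorts.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of actual positives that the classifier flags as positive."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Fraction of actual negatives that the classifier flags as negative."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical counts for a lesion classifier (not taken from the paper).
counts = {"tp": 90, "fn": 10, "tn": 950, "fp": 50}

example_sensitivity = sensitivity(counts["tp"], counts["fn"])
example_specificity = specificity(counts["tn"], counts["fp"])
```

With a strongly imbalanced dataset such as the one described, where negatives far outnumber positives, both numbers are needed: a classifier can reach high specificity while still missing many positive cases.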
11.
Cells ; 11(16), 2022 08 17.
Article in English | MEDLINE | ID: mdl-36010633

ABSTRACT

Diet is a critical environmental factor affecting breast cancer risk, and recent evidence shows that dietary exposures during early development can affect lifetime mammary cancer susceptibility. To elucidate the underlying mechanisms, we used our established crossover feeding mouse model, where exposure to a high-fat and high-sugar (HFHS) diet during defined developmental windows determines mammary tumor incidence and latency in carcinogen-treated mice. Mammary tumor incidence is significantly increased in mice receiving a HFHS post-weaning diet (high-tumor mice, HT) compared to those receiving a HFHS diet during gestation (low-tumor mice, LT). The current study revealed that the mammary stem cell (MaSC) population was significantly increased in mammary glands from HT compared to LT mice. Igf1 expression was increased in mammary stromal cells from HT mice, where it promoted MaSC self-renewal. The increased Igf1 expression was induced by DNA hypomethylation of the Igf1 Pr1 promoter, mediated by a decrease in Dnmt3b levels. Mammary tissues from HT mice also had reduced levels of Igfbp5, leading to increased bioavailability of tissue Igf1. This study provides novel insights into how early dietary exposures program mammary cancer risk, demonstrating that effective dietary intervention can reduce mammary cancer incidence.


Subjects
Dietary Exposure; Mammary Neoplasms, Animal; Animals; Carcinogens; Diet; Mammary Neoplasms, Animal/metabolism; Mice; Stem Cells/metabolism