Search | BVS - MINISTÉRIO DA SAÚDE

1.

Scalable lipid droplet microarray fabrication, validation, and screening.

Bell, Tracey N; Kusi-Appiah, Aubrey E; Tocci, Vincent; Lyu, Pengfei; Zhu, Lei; Zhu, Fanxiu; Van Winkle, David; Cao, Hongyuan; Singh, Mandip S; Lenhert, Steven.

PLoS One ; 19(7): e0304736, 2024.

Article in English | MEDLINE | ID: mdl-38968248

ABSTRACT

High throughput screening of small molecules and natural products is costly, requiring significant amounts of time, reagents, and operating space. Although microarrays have proven effective in the miniaturization of screening for certain biochemical assays, such as nucleic acid hybridization or antibody binding, they are not widely used for drug discovery in cell culture due to the need for cells to internalize lipophilic drug candidates. Lipid droplet microarrays are a promising solution to this problem as they are capable of delivering lipophilic drugs to cells at dosages comparable to solution delivery. However, the scalability of the array fabrication, assay validation, and screening steps has limited the utility of this approach. Here we take several new steps to scale up the process for lipid droplet array fabrication, assay validation in cell culture, and drug screening. A nanointaglio printing process has been adapted for use with a printing press. The arrays are stabilized for immersion into aqueous solution using a vapor coating process. In addition to delivery of lipophilic compounds, we found that we are also able to encapsulate and deliver a water-soluble compound in this way. The arrays can be functionalized by extracellular matrix proteins such as collagen prior to cell culture as the mechanism for uptake is based on direct contact with the lipid delivery vehicles rather than diffusion of the drug out of the microarray spots. We demonstrate this method for delivery to 3 different cell types and the screening of 92 natural product extracts on a microarray covering an area of less than 0.1 cm². The arrays are suitable for miniaturized screening, for instance in high biosafety level facilities where space is limited and for applications where cell numbers are limited, such as in functional precision medicine.

Subjects

Lipid Droplets; Humans; Lipid Droplets/metabolism; Microarray Analysis/methods; Animals; Drug Evaluation, Preclinical/methods; High-Throughput Screening Assays/methods

2.

EndoPRS: Incorporating Endophenotype Information to Improve Polygenic Risk Scores for Clinical Endpoints.

Kharitonova, Elena V; Sun, Quan; Ockerman, Frank; Chen, Brian; Zhou, Laura Y; Cao, Hongyuan; Mathias, Rasika A; Auer, Paul L; Ober, Carole; Raffield, Laura M; Reiner, Alexander P; Cox, Nancy J; Kelada, Samir; Tao, Ran; Li, Yun.

medRxiv ; 2024 May 24.

Article in English | MEDLINE | ID: mdl-38826253

ABSTRACT

Polygenic risk score (PRS) prediction of complex diseases can be improved by leveraging related phenotypes. This has motivated the development of several multi-trait PRS methods that jointly model information from genetically correlated traits. However, these methods do not account for vertical pleiotropy between traits, in which one trait acts as a mediator for another. Here, we introduce endoPRS, a weighted lasso model that incorporates information from relevant endophenotypes to improve disease risk prediction without making assumptions about the genetic architecture underlying the endophenotype-disease relationship. Through extensive simulation analysis, we demonstrate the robustness of endoPRS in a variety of complex genetic frameworks. We also apply endoPRS to predict the risk of childhood onset asthma in UK Biobank by leveraging a paired GWAS of eosinophil count, a relevant endophenotype. We find that endoPRS significantly improves prediction compared to many existing PRS methods, including multi-trait PRS methods, MTAG and wMT-BLUP, which suggests advantages of endoPRS in real-life clinical settings.

3.

A powerful approach to identify replicable variants in genome-wide association studies.

Li, Yan; Lei, Haochen; Wen, Xiaoquan; Cao, Hongyuan.

Am J Hum Genet ; 111(5): 966-978, 2024 05 02.

Article in English | MEDLINE | ID: mdl-38701746

ABSTRACT

Replicability is the cornerstone of modern scientific research. Reliable identifications of genotype-phenotype associations that are significant in multiple genome-wide association studies (GWASs) provide stronger evidence for the findings. Current replicability analysis relies on the independence assumption among single-nucleotide polymorphisms (SNPs) and ignores the linkage disequilibrium (LD) structure. We show that such a strategy may produce either overly liberal or overly conservative results in practice. We develop an efficient method, ReAD, to detect replicable SNPs associated with the phenotype from two GWASs accounting for the LD structure. The local dependence structure of SNPs across two heterogeneous studies is captured by a four-state hidden Markov model (HMM) built on two sequences of p values. By incorporating information from adjacent locations via the HMM, our approach provides more accurate SNP significance rankings. ReAD is scalable, platform independent, and more powerful than existing replicability analysis methods with effective false discovery rate control. Through analysis of datasets from two asthma GWASs and two ulcerative colitis GWASs, we show that ReAD can identify replicable genetic loci that existing methods might otherwise miss.

Subjects

Asthma; Genome-Wide Association Study; Linkage Disequilibrium; Polymorphism, Single Nucleotide; Genome-Wide Association Study/methods; Humans; Asthma/genetics; Markov Chains; Colitis, Ulcerative/genetics; Reproducibility of Results; Phenotype; Genotype

4.

Combining protein sequences and structures with transformers and equivariant graph neural networks to predict protein function.

Boadu, Frimpong; Cao, Hongyuan; Cheng, Jianlin.

Bioinformatics ; 39(39 Suppl 1): i318-i325, 2023 06 30.

Article in English | MEDLINE | ID: mdl-37387145

ABSTRACT

MOTIVATION: Millions of protein sequences have been generated by numerous genome and transcriptome sequencing projects. However, experimentally determining the function of the proteins is still a time-consuming, low-throughput, and expensive process, leading to a large protein sequence-function gap. Therefore, it is important to develop computational methods to accurately predict protein function to fill the gap. Even though many methods have been developed to use protein sequences as input to predict function, far fewer methods leverage protein structures in protein function prediction because there was a lack of accurate protein structures for most proteins until recently. RESULTS: We developed TransFun, a method using a transformer-based protein language model and 3D-equivariant graph neural networks to distill information from both protein sequences and structures to predict protein function. It extracts feature embeddings from protein sequences using a pre-trained protein language model (ESM) via transfer learning and combines them with 3D structures of proteins predicted by AlphaFold2 through equivariant graph neural networks. Benchmarked on the CAFA3 test dataset and a new test dataset, TransFun outperforms several state-of-the-art methods, indicating that the language model and 3D-equivariant graph neural networks are effective methods to leverage protein sequences and structures to improve protein function prediction. Combining TransFun predictions and sequence similarity-based predictions can further increase prediction accuracy. AVAILABILITY AND IMPLEMENTATION: The source code of TransFun is available at https://github.com/jianlin-cheng/TransFun.

Subjects

Benchmarking; Language; Amino Acid Sequence; Neural Networks, Computer; Software

5.

JUMP: replicability analysis of high-throughput experiments with applications to spatial transcriptomic studies.

Lyu, Pengfei; Li, Yan; Wen, Xiaoquan; Cao, Hongyuan.

Bioinformatics ; 39(6), 2023 06 01.

Article in English | MEDLINE | ID: mdl-37279733

ABSTRACT

MOTIVATION: Replicability is the cornerstone of scientific research. The current statistical method for high-dimensional replicability analysis either cannot control the false discovery rate (FDR) or is too conservative. RESULTS: We propose a statistical method, JUMP, for the high-dimensional replicability analysis of two studies. The input is a high-dimensional paired sequence of p-values from two studies and the test statistic is the maximum of p-values of the pair. JUMP uses four states of the p-value pairs to indicate whether they are null or non-null. Conditional on the hidden states, JUMP computes the cumulative distribution function of the maximum of p-values for each state to conservatively approximate the probability of rejection under the composite null of replicability. JUMP estimates unknown parameters and uses a step-up procedure to control FDR. By incorporating different states of composite null, JUMP achieves a substantial power gain over existing methods while controlling the FDR. Analyzing two pairs of spatially resolved transcriptomic datasets, JUMP makes biological discoveries that otherwise cannot be obtained by using existing methods. AVAILABILITY AND IMPLEMENTATION: An R package JUMP implementing the JUMP method is available on CRAN (https://CRAN.R-project.org/package=JUMP).
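
The test statistic described above, the maximum of the two p-values in each pair, can be illustrated with a short sketch. This is not the JUMP procedure: it is only the naive baseline in which the pairwise maximum is passed to a standard Benjamini-Hochberg step-up rule, which is generally conservative for replicability analysis. The function names `max_p_values` and `bh_step_up` and the example p-values are hypothetical, chosen purely for illustration.

```python
from typing import List, Sequence


def max_p_values(p1: Sequence[float], p2: Sequence[float]) -> List[float]:
    """Pairwise maximum of p-values from two studies (one pair per feature)."""
    if len(p1) != len(p2):
        raise ValueError("both studies must report the same number of p-values")
    return [max(a, b) for a, b in zip(p1, p2)]


def bh_step_up(pvals: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg step-up rule.

    Rejects the k smallest p-values, where k is the largest rank such that
    the k-th smallest p-value is at most k * alpha / m.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank * alpha / m:
            k = rank  # keep the largest rank that satisfies the bound
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject


# Illustrative (made-up) p-values for four features in two studies.
study1 = [0.001, 0.2, 0.0005, 0.03]
study2 = [0.002, 0.01, 0.9, 0.04]
pmax = max_p_values(study1, study2)
decisions = bh_step_up(pmax, alpha=0.05)
```

A feature is flagged only when both studies give a small p-value, which is why the third feature (significant in one study only) is not rejected in this example.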

Subjects

Gene Expression Profiling; Transcriptome; Gene Expression Profiling/methods

6.

Correction to "Diastereoselective Synthesis of Chromeno[3,2-d]isoxazoles via Brønsted Acid Catalyzed Tandem 1,6-Addition/Double Annulations of o-Hydroxyl Propargylic Alcohols".

Li, Zhu; Zhang, Pei-Xu; Li, Zhao-Zhao; Zhang, Xing-Lu; Cao, Hong-Yuan; Gao, Yu-Ning; Bian, Ming; Chen, Hui-Yu; Liu, Zhen-Jiang.

Org Lett ; 25(15): 2750, 2023 Apr 21.

Article in English | MEDLINE | ID: mdl-37027818

7.

Combining protein sequences and structures with transformers and equivariant graph neural networks to predict protein function.

Boadu, Frimpong; Cao, Hongyuan; Cheng, Jianlin.

bioRxiv ; 2023 Jan 20.

Article in English | MEDLINE | ID: mdl-36711471

ABSTRACT

Motivation: Millions of protein sequences have been generated by numerous genome and transcriptome sequencing projects. However, experimentally determining the function of the proteins is still a time-consuming, low-throughput, and expensive process, leading to a large protein sequence-function gap. Therefore, it is important to develop computational methods to accurately predict protein function to fill the gap. Even though many methods have been developed to use protein sequences as input to predict function, far fewer methods leverage protein structures in protein function prediction because there was a lack of accurate protein structures for most proteins until recently. Results: We developed TransFun, a method using a transformer-based protein language model and 3D-equivariant graph neural networks to distill information from both protein sequences and structures to predict protein function. It extracts feature embeddings from protein sequences using a pre-trained protein language model (ESM) via transfer learning and combines them with 3D structures of proteins predicted by AlphaFold2 through equivariant graph neural networks. Benchmarked on the CAFA3 test dataset and a new test dataset, TransFun outperforms several state-of-the-art methods, indicating that the language model and 3D-equivariant graph neural networks are effective methods to leverage protein sequences and structures to improve protein function prediction. Combining TransFun predictions and sequence similarity-based predictions can further increase prediction accuracy. Availability: The source code of TransFun is available at https://github.com/jianlin-cheng/TransFun. Contact: chengji@missouri.edu.

8.

Proposal for collinear integrated acousto-optic tunable filters featuring ultrawide tuning ranges and multi-band operations.

Pan, Bingcheng; Cao, Hongyuan; Li, Huan; Dai, Daoxin.

Opt Express ; 30(14): 24747-24761, 2022 Jul 04.

Article in English | MEDLINE | ID: mdl-36237021

ABSTRACT

Integrated optical tunable filters are key components for a wide spectrum of applications, including optical communications and interconnects, spectral analysis, and tunable light sources, among others. Compared with their thermo-optic counterparts, integrated acousto-optic (AO) tunable filters provide a unique approach to achieve superior performance, including ultrawide continuous tuning ranges of hundreds of nm, low power consumption of sub-mW and fast tuning speed of sub-µs. Based on suspended one-dimensional (1D) AO waveguides in the collinear configuration, we propose and theoretically investigate an innovative family of integrated AO tunable filters (AOTFs) on thin-film lithium niobate. The AO waveguides perform as tunable wavelength-selective narrow-band polarization rotators, where highly efficient conversion between co-propagating TE0 and TM0 modes is enabled by the torsional acoustic A1 mode, which can be selectively excited by a novel antisymmetric wavefront interdigital transducer. Furthermore, we systematically and quantitatively explore the possibilities of exciting modulated acoustic waves, which contain multiple frequency components, along the AO waveguide to achieve independently reconfigurable multi-band operations, with tunable time-variant spectral shapes. By incorporating a complete set of ultrawide-band polarization-handling components, we have proposed and theoretically investigated several representative monolithic AOTF configurations, featuring different arrangements of single or cascaded identical AO waveguides. One of the present AOTF designs exhibits a theoretical linewidth of ∼8 nm (∼4 nm), a sidelobe suppression ratio of ∼75 dB, and theoretically no excess loss at the center wavelength of 1550 nm (1310 nm), with an ultrawide tuning range of 1.25-1.65 µm (from O-band to L-band), a fast tuning speed of 0.14 µs, and a low power consumption of a few mW.

9.

Diastereoselective Synthesis of Chromeno[3,2-d]isoxazoles via Brønsted Acid Catalyzed Tandem 1,6-Addition/Double Annulations of o-Hydroxyl Propargylic Alcohols.

Li, Zhu; Zhang, Pei-Xu; Li, Zhao-Zhao; Zhang, Xing-Lu; Cao, Hong-Yuan; Gao, Yu-Ning; Bian, Ming; Chen, Hui-Yu; Liu, Zhen-Jiang.

Org Lett ; 24(37): 6863-6868, 2022 Sep 23.

Article in English | MEDLINE | ID: mdl-36102802

ABSTRACT

A Brønsted acid catalyzed tandem process to access densely functionalized chromeno[3,2-d]isoxazoles with good to excellent yields and diastereoselectivities is disclosed. The procedure is proposed to involve a 1,6-conjugate addition/electrophilic addition/double annulation process between nitrones and alkynyl o-quinone methides (o-AQMs) generated in situ from o-hydroxyl propargylic alcohols. Mild conditions, good functional group compatibility, easy scale-up of the reaction, and further product transformations demonstrate its potential application.

10.

Statistical analysis of spatially resolved transcriptomic data by incorporating multiomics auxiliary information.

Li, Yan; Zhou, Xiang; Cao, Hongyuan.

Genetics ; 221(4), 2022 07 30.

Article in English | MEDLINE | ID: mdl-35731210

ABSTRACT

Effective control of false discovery rate is key for multiplicity problems. Here, we consider incorporating informative covariates from external datasets in the multiple testing procedure to boost statistical power while maintaining false discovery rate control. In particular, we focus on the statistical analysis of innovative high-dimensional spatial transcriptomic data while incorporating external multiomics data that provide distinct but complementary information to the detection of spatial expression patterns. We extend OrderShapeEM, an efficient covariate-assisted multiple testing procedure that incorporates one auxiliary study, to make it permissible to incorporate multiple external omics studies, to boost statistical power of spatial expression pattern detection. Specifically, we first use a recently proposed computationally efficient statistical analysis method, spatial pattern recognition via kernels, to produce the primary test statistics for spatial transcriptomic data. Afterwards, we construct the auxiliary covariate by combining information from multiple external omics studies, such as bulk and single-cell RNA-seq data using the Cauchy combination rule. Finally, we extend and implement the integrative analysis method OrderShapeEM on the primary P-values along with auxiliary data incorporating multiomics information for efficient covariate-assisted spatial expression analysis. We conduct a series of realistic simulations to evaluate the performance of our method with known ground truth. Four case studies in mouse olfactory bulb, mouse cerebellum, human breast cancer, and human heart tissues further demonstrate the substantial power gain of our method in detecting genes with spatial expression patterns compared to existing classic approaches that do not utilize any external information.
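
The Cauchy combination rule mentioned above merges several p-values into one by transforming each to a standard Cauchy variate, averaging, and transforming back. The sketch below shows the generic form of that rule only, not the authors' pipeline; the function name `cauchy_combine`, the default of equal weights, and the example inputs are assumptions made for illustration.

```python
import math
from typing import Optional, Sequence


def cauchy_combine(pvals: Sequence[float],
                   weights: Optional[Sequence[float]] = None) -> float:
    """Combine p-values with the Cauchy combination rule.

    T = sum_i w_i * tan((0.5 - p_i) * pi), with non-negative weights summing
    to 1, and the combined p-value is 0.5 - arctan(T) / pi.
    """
    n = len(pvals)
    if n == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p < 1.0) for p in pvals):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0 / n] * n  # assumption: equal weights by default
    if len(weights) != n or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must match the p-values and sum to 1")
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvals))
    return 0.5 - math.atan(t) / math.pi


# Illustrative (made-up) p-values for one gene from two external studies.
combined = cauchy_combine([0.001, 0.8])
```

Because the tangent transform is heavy-tailed, the combined value is driven mostly by the smallest p-values, so one strongly significant auxiliary study is not washed out by an uninformative one.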

Subjects

Transcriptome; Animals; Humans; Mice

11.

Regression analysis of additive hazards model with sparse longitudinal covariates.

Sun, Zhuowei; Cao, Hongyuan; Chen, Li.

Lifetime Data Anal ; 28(2): 263-281, 2022 04.

Article in English | MEDLINE | ID: mdl-35147908

ABSTRACT

The additive hazards model is often used to complement the proportional hazards model in the analysis of failure time data. Statistical inference for the additive hazards model with time-dependent longitudinal covariates requires the availability of the whole trajectory of the longitudinal process, which is not realistic in practice. The commonly used last value carried forward approach for intermittently observed longitudinal covariates can induce biased parameter estimation. The more principled joint modeling of the longitudinal process and failure time data imposes strong modeling assumptions, which are difficult to verify. In this paper, we propose methods that weigh the distance between the observational time of longitudinal covariates and the failure time, resulting in unbiased regression coefficient estimation. We establish the consistency and asymptotic normality of the proposed estimators. Simulation studies provide numerical support for the theoretical findings. Data from an Alzheimer's study illustrate the practical utility of the methodology.

Subjects

Proportional Hazards Models; Computer Simulation; Humans; Regression Analysis

12.

OPTIMAL FALSE DISCOVERY RATE CONTROL FOR LARGE SCALE MULTIPLE TESTING WITH AUXILIARY INFORMATION.

Cao, Hongyuan; Chen, Jun; Zhang, Xianyang.

Ann Stat ; 50(2): 807-857, 2022 Apr.

Article in English | MEDLINE | ID: mdl-37138896

ABSTRACT

Large-scale multiple testing is a fundamental problem in high dimensional statistical inference. It is increasingly common that various types of auxiliary information, reflecting the structural relationship among the hypotheses, are available. Exploiting such auxiliary information can boost statistical power. To this end, we propose a framework based on a two-group mixture model with varying probabilities of being null for different hypotheses a priori, where a shape-constrained relationship is imposed between the auxiliary information and the prior probabilities of being null. An optimal rejection rule is designed to maximize the expected number of true positives when average false discovery rate is controlled. Focusing on the ordered structure, we develop a robust EM algorithm to estimate the prior probabilities of being null and the distribution of p-values under the alternative hypothesis simultaneously. We show that the proposed method has better power than state-of-the-art competitors while controlling the false discovery rate, both empirically and theoretically. Extensive simulations demonstrate the advantage of the proposed method. Datasets from genome-wide association studies are used to illustrate the new methodology.

13.

On computation of semiparametric maximum likelihood estimators with shape constraints.

Wang, Yudong; Ye, Zhi-Sheng; Cao, Hongyuan.

Biometrics ; 77(1): 113-124, 2021 03.

Article in English | MEDLINE | ID: mdl-32271941

ABSTRACT

Large sample theory of semiparametric models based on maximum likelihood estimation (MLE) with shape constraint on the nonparametric component is well studied. Relatively less attention has been paid to the computational aspect of semiparametric MLE. The computation of semiparametric MLE based on existing approaches such as the expectation-maximization (EM) algorithm can be computationally prohibitive when the missing rate is high. In this paper, we propose a computational framework for semiparametric MLE based on an inexact block coordinate ascent (BCA) algorithm. We show theoretically that the proposed algorithm converges. This computational framework can be applied to a wide range of data with different structures, such as panel count data, interval-censored data, and degradation data, among others. Simulation studies demonstrate favorable performance compared with existing algorithms in terms of accuracy and speed. Two data sets are used to illustrate the proposed computational method. We further implement the proposed computational method in R package BCA1SG, available at CRAN.

Subjects

Algorithms; Models, Statistical; Computer Simulation; Likelihood Functions

14.

Intensive Surveillance with Biannual Dynamic Contrast-Enhanced Magnetic Resonance Imaging Downstages Breast Cancer in BRCA1 Mutation Carriers.

Guindalini, Rodrigo Santa Cruz; Zheng, Yonglan; Abe, Hiroyuki; Whitaker, Kristen; Yoshimatsu, Toshio F; Walsh, Tom; Schacht, David; Kulkarni, Kirti; Sheth, Deepa; Verp, Marion S; Bradbury, Angela R; Churpek, Jane; Obeid, Elias; Mueller, Jeffrey; Khramtsova, Galina; Liu, Fang; Raoul, Akila; Cao, Hongyuan; Romero, Iris L; Hong, Susan; Livingston, Robert; Jaskowiak, Nora; Wang, Xiaoming; Debiasi, Marcio; Pritchard, Colin C; King, Mary-Claire; Karczmar, Gregory; Newstead, Gillian M; Huo, Dezheng; Olopade, Olufunmilayo I.

Clin Cancer Res ; 25(6): 1786-1794, 2019 03 15.

Article in English | MEDLINE | ID: mdl-30154229

ABSTRACT

PURPOSE: To establish a cohort of high-risk women undergoing intensive surveillance for breast cancer. EXPERIMENTAL DESIGN: We performed dynamic contrast-enhanced MRI every 6 months in conjunction with annual mammography (MG). Eligible participants had a cumulative lifetime breast cancer risk ≥20% and/or tested positive for a pathogenic mutation in a known breast cancer susceptibility gene. RESULTS: Between 2004 and 2016, we prospectively enrolled 295 women, including 157 mutation carriers (75 BRCA1, 61 BRCA2); participants' mean age at entry was 43.3 years. Seventeen cancers were later diagnosed: 4 ductal carcinoma in situ (DCIS) and 13 early-stage invasive breast cancers. Fifteen cancers occurred in mutation carriers (11 BRCA1, 3 BRCA2, 1 CDH1). Median size of the invasive cancers was 0.61 cm. No patients had lymph node metastasis at time of diagnosis, and no interval invasive cancers occurred. The sensitivity of biannual MRI alone was 88.2% and annual MG plus biannual MRI was 94.1%. The cancer detection rate of biannual MRI alone was 0.7% per 100 screening episodes, which is similar to the cancer detection rate of 0.7% per 100 screening episodes for annual MG plus biannual MRI. The number of recalls and biopsies needed to detect one cancer by biannual MRI were 2.8 and 1.7 in BRCA1 carriers, 12.0 and 8.0 in BRCA2 carriers, and 11.7 and 5.0 in non-BRCA1/2 carriers, respectively. CONCLUSIONS: Biannual MRI performed well for early detection of invasive breast cancer in genomically stratified high-risk women. No benefit was associated with annual MG screening plus biannual MRI screening. See related commentary by Kuhl and Schrading, p. 1693.

Subjects

BRCA1 Protein/genetics; Breast Neoplasms/diagnosis; Early Detection of Cancer/methods; Magnetic Resonance Imaging/methods; Mass Screening/methods; Adult; Biopsy; Breast Neoplasms/genetics; Breast Neoplasms/pathology; Female; Genetic Predisposition to Disease; Humans; Mammography; Middle Aged; Mutation; Neoplasm Staging; Prospective Studies

15.

Efficacy of Anti-HER2 Agents in Combination With Adjuvant or Neoadjuvant Chemotherapy for Early and Locally Advanced HER2-Positive Breast Cancer Patients: A Network Meta-Analysis.

Debiasi, Márcio; Polanczyk, Carisi A; Ziegelmann, Patrícia; Barrios, Carlos; Cao, Hongyuan; Dignam, James J; Goss, Paul; Bychkovsky, Brittany; Finkelstein, Dianne M; Guindalini, Rodrigo S; Filho, Paulo; Albuquerque, Caroline; Reinert, Tomás; de Azambuja, Evandro; Olopade, Olufunmilayo.

Front Oncol ; 8: 156, 2018.

Article in English | MEDLINE | ID: mdl-29872641

ABSTRACT

BACKGROUND: Several (neo)adjuvant treatments for patients with HER2-positive breast cancer have been compared in different randomized clinical trials. Since it is not feasible to conduct adequate pairwise comparative trials of all these therapeutic options, network meta-analysis offers an opportunity for more detailed inference for evidence-based therapy. METHODS: Phase II/III randomized clinical trials comparing two or more different (neo)adjuvant treatments for HER2-positive breast cancer patients were included. Relative treatment effects were pooled in two separate network meta-analyses for overall survival (OS) and disease-free survival (DFS). RESULTS: 17 clinical trials met our eligibility criteria. Two different networks of trials were created based on the availability of the outcomes: OS network (15 trials: 37,837 patients); and DFS network (17 trials: 40,992 patients). Two studies, the ExteNET and the NeoSphere trials, were included only in the DFS network because OS data have not yet been reported. The concept of the dual anti-HER2 blockade proved to be the best option in terms of OS and DFS. Chemotherapy (CT) plus trastuzumab (T) and lapatinib (L) and CT + T + Pertuzumab (P) are probably the best treatment options in terms of OS, with 62.47% and 22.06%, respectively. In the DFS network, CT + T + Neratinib (N) was the best treatment option with 50.55%, followed by CT + T + P (26.59%) and CT + T + L (20.62%). CONCLUSION: This network meta-analysis suggests that dual anti-HER2 blockade with trastuzumab plus either lapatinib or pertuzumab is probably the best treatment option in the (neo)adjuvant setting for HER2-positive breast cancer patients in terms of OS gain. Mature OS results are still expected for the Aphinity trial and for the sequential use of trastuzumab followed by neratinib, the treatment that showed the best performance in terms of DFS in our analysis.

16.

False discovery rate control incorporating phylogenetic tree increases detection power in microbiome-wide multiple testing.

Xiao, Jian; Cao, Hongyuan; Chen, Jun.

Bioinformatics ; 33(18): 2873-2881, 2017 Sep 15.

Article in English | MEDLINE | ID: mdl-28505251

ABSTRACT

MOTIVATION: Next generation sequencing technologies have enabled the study of the human microbiome through direct sequencing of microbial DNA, resulting in an enormous amount of microbiome sequencing data. One unique characteristic of microbiome data is the phylogenetic tree that relates all the bacterial species. Closely related bacterial species have a tendency to exhibit a similar relationship with the environment or disease. Thus, incorporating the phylogenetic tree information can potentially improve the detection power for microbiome-wide association studies, where hundreds or thousands of tests are conducted simultaneously to identify bacterial species associated with a phenotype of interest. Despite much progress in multiple testing procedures such as false discovery rate (FDR) control, methods that take into account the phylogenetic tree are largely limited. RESULTS: We propose a new FDR control procedure that incorporates the prior structure information and apply it to microbiome data. The proposed procedure is based on a hierarchical model, where a structure-based prior distribution is designed to utilize the phylogenetic tree. By borrowing information from neighboring bacterial species, we are able to improve the statistical power of detecting associated bacterial species while controlling the FDR at desired levels. When the phylogenetic tree is mis-specified or non-informative, our procedure achieves a similar power as traditional procedures that do not take into account the tree structure. We demonstrate the performance of our method through extensive simulations and real microbiome datasets. We identified far more alcohol-drinking associated bacterial species than traditional methods. AVAILABILITY AND IMPLEMENTATION: R package StructFDR is available from CRAN. CONTACT: chen.jun2@mayo.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Bacteria/genetics; High-Throughput Nucleotide Sequencing/methods; Microbiota/genetics; Phylogeny; Software; Genomics/methods; Humans; Polymorphism, Genetic; Sequence Analysis, DNA/methods

17.

Clinical Evaluation of Cisplatin Sensitivity of Germline Polymorphisms in Neoadjuvant Chemotherapy for Urothelial Cancer.

O'Donnell, Peter H; Alanee, Shaheen; Stratton, Kelly L; Garcia-Grossman, Ilana R; Cao, Hongyuan; Ostrovnaya, Irina; Plimack, Elizabeth R; Manschreck, Christopher; Ganshert, Cory; Smith, Norm D; Steinberg, Gary D; Vijai, Joseph; Offit, Kenneth; Stadler, Walter M; Bajorin, Dean F.

Clin Genitourin Cancer ; 14(6): 511-517, 2016 12.

Article in English | MEDLINE | ID: mdl-27150640

ABSTRACT

BACKGROUND: Level 1 evidence has demonstrated increased overall survival with cisplatin-based neoadjuvant chemotherapy for patients with muscle-invasive urothelial cancer. Usage remains low, however, in part because neoadjuvant chemotherapy will not be effective for every patient. To identify the patients most likely to benefit, we evaluated germline pharmacogenomic markers for association with neoadjuvant chemotherapy sensitivity in 2 large cohorts of patients with urothelial cancer. PATIENTS AND METHODS: Patients receiving neoadjuvant cisplatin-based chemotherapy for muscle-invasive urothelial cancer were eligible. Nine germline single nucleotide polymorphisms (SNPs) potentially conferring platinum sensitivity were tested for an association with a complete pathologic response to neoadjuvant chemotherapy (pT0) or elimination of muscle-invasive cancer (

Subjects

Antineoplastic Agents/administration & dosage; Carcinoma, Transitional Cell/drug therapy; Cisplatin/administration & dosage; Germ-Line Mutation; Polymorphism, Single Nucleotide; Adult; Aged; Aged, 80 and over; Antineoplastic Agents/pharmacology; Carcinoma, Transitional Cell/genetics; Carcinoma, Transitional Cell/pathology; Cisplatin/pharmacology; Female; Humans; Male; Middle Aged; Neoadjuvant Therapy; Neoplasm Staging; Pharmacogenomic Variants; Survival Analysis; Treatment Outcome

18.

Assessing agreement with multiple raters on correlated kappa statistics.

Cao, Hongyuan; Sen, Pranab K; Peery, Anne F; Dellon, Evan S.

Biom J ; 58(4): 935-43, 2016 Jul.

Article in English | MEDLINE | ID: mdl-26890370

ABSTRACT

In clinical studies, it is often of interest to see the diagnostic agreement among clinicians on certain symptoms. Previous work has focused on the agreement between two clinicians under two different conditions or the agreement among multiple clinicians under one condition. Few have discussed the agreement study with a design where multiple clinicians examine the same group of patients under two different conditions. In this paper, we use the intraclass kappa statistic for assessing nominal scale agreement with such a design. We derive an explicit variance formula for the difference of correlated kappa statistics and conduct hypothesis testing for the equality of kappa statistics. Simulation studies show that the method performs well with realistic sample sizes and may be superior to a method that did not take into account the measurement dependence structure. The practical utility of the method is illustrated on data from an eosinophilic esophagitis (EoE) study.

Subjects

Biometry/methods , Diagnostic Techniques and Procedures/standards , Statistical Models , Computer Simulation , Humans , Reproducibility of Results , Research Design , Sample Size

19.

Analysis of the Proportional Hazards Model with Sparse Longitudinal Covariates.

Cao, Hongyuan; Churpek, Mathew M; Zeng, Donglin; Fine, Jason P.

J Am Stat Assoc ; 110(511): 1187-1196, 2015.

Article in English | MEDLINE | ID: mdl-26576066

ABSTRACT

Regression analysis of censored failure observations via the proportional hazards model permits time-varying covariates which are observed at death times. In practice, such longitudinal covariates are typically sparse and only measured at infrequent and irregularly spaced follow-up times. Full likelihood analyses of joint models for longitudinal and survival data impose stringent modelling assumptions which are difficult to verify in practice and which are complicated both inferentially and computationally. In this article, a simple kernel weighted score function is proposed with minimal assumptions. Two scenarios are considered: half kernel estimation in which observation ceases at the time of the event and full kernel estimation for data where observation may continue after the event, as with recurrent events data. It is established that these estimators are consistent and asymptotically normal. However, they converge at rates which are slower than the parametric rates which may be achieved with fully observed covariates, with the full kernel method achieving an optimal convergence rate which is superior to that of the half kernel method. Simulation results demonstrate that the large sample approximations are adequate for practical use and may yield improved performance relative to last value carried forward approach and joint modelling method. The analysis of the data from a cardiac arrest study demonstrates the utility of the proposed methods.

20.

Regression analysis of sparse asynchronous longitudinal data.

Cao, Hongyuan; Zeng, Donglin; Fine, Jason P.

J R Stat Soc Series B Stat Methodol ; 77(4): 755-776, 2015 Sep.

Article in English | MEDLINE | ID: mdl-26568699

ABSTRACT

We consider estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent responses and covariates are observed intermittently within subjects. Unlike with synchronous data, where the response and covariates are observed at the same time point, with asynchronous data, the observation times are mismatched. Simple kernel-weighted estimating equations are proposed for generalized linear models with either time invariant or time-dependent coefficients under smoothness assumptions for the covariate processes which are similar to those for synchronous data. For models with either time invariant or time-dependent coefficients, the estimators are consistent and asymptotically normal but converge at slower rates than those achieved with synchronous data. Simulation studies evidence that the methods perform well with realistic sample sizes and may be superior to a naive application of methods for synchronous data based on an ad hoc last value carried forward approach. The practical utility of the methods is illustrated on data from a study on human immunodeficiency virus.
