Results 1 - 20 of 24
1.
J Comput Biol ; 30(10): 1075-1088, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37871292

ABSTRACT

Rare variant association studies with multiple traits or diseases have drawn considerable attention because association signals of rare variants can be boosted when more than one phenotype is associated with the same rare variants. Most existing statistical methods for identifying rare variants associated with multiple phenotypes are based on a group test, in which pre-specified genetic regions are tested one at a time. However, these methods are not designed to locate susceptible rare variants within a genetic region. In this article, we propose a new statistical method to prioritize rare variants within a genetic region when a group test for that region identifies a statistical association with multiple phenotypes. The method computes the weighted selection probability (WSP) of individual rare variants and ranks them from largest to smallest WSP. In simulation studies, we demonstrated that the proposed method outperforms other statistical methods in terms of true positive selection when multiple phenotypes are correlated with each other. We also applied it to our soybean single nucleotide polymorphism (SNP) data with 13 highly correlated amino acids, where we identified some potentially susceptible rare variants on chromosome 19.
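
The ranking step described in this abstract can be illustrated with a small sketch. The abstract does not give the WSP formula, so the definition used here (the weighted fraction of resamples in which a variant was selected), the function name `rank_by_wsp`, and the per-resample weights are all assumptions made for illustration only.

```python
def rank_by_wsp(selected, weights):
    """Rank variants by a hypothetical weighted selection probability.

    selected: dict mapping variant name -> list of 0/1 flags, one per resample
    weights:  list of non-negative weights, one per resample (assumed input)
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    # Assumed definition: weighted share of resamples that selected the variant.
    wsp = {
        variant: sum(w * s for w, s in zip(weights, flags)) / total
        for variant, flags in selected.items()
    }
    # Largest WSP first, matching the ordering described in the abstract.
    return sorted(wsp.items(), key=lambda item: item[1], reverse=True)


ranking = rank_by_wsp(
    {"rv1": [1, 1, 0, 1], "rv2": [0, 1, 0, 0], "rv3": [1, 1, 1, 1]},
    weights=[1.0, 2.0, 1.0, 1.0],
)
```

Here `ranking` lists the toy variants in decreasing order of the assumed WSP; real data would replace the toy flags with the output of the actual selection procedure.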


Subject(s)
Glycine max , Polymorphism, Single Nucleotide , Genetic Association Studies , Glycine max/genetics , Phenotype , Computer Simulation , Models, Genetic , Genetic Variation/genetics , Genome-Wide Association Study/methods
2.
BMC Bioinformatics ; 24(1): 381, 2023 Oct 10.
Article in English | MEDLINE | ID: mdl-37817069

ABSTRACT

BACKGROUND: Identification of pleiotropic variants associated with multiple phenotypic traits has received increasing attention in genetic association studies. Overlapping genetic associations from multiple traits help to detect weak genetic associations missed by single-trait analyses. Many statistical methods have been developed to identify pleiotropic variants, but most of them are limited to quantitative traits, even though pleiotropic effects on both quantitative and qualitative traits have been observed. This is a statistically challenging problem because no appropriate multivariate distribution exists to model quantitative and qualitative data together. Alternatively, meta-analysis methods can be applied; these integrate summary statistics of individual variants associated with either a quantitative or a qualitative trait without accounting for correlations among genetic variants. RESULTS: We propose a new statistical selection method based on a unified selection score that quantifies how strongly a genetic variant (i.e., a pleiotropic variant) is associated with both quantitative and qualitative traits. In our extensive simulation studies, where various types of pleiotropic effects on both quantitative and qualitative traits were considered, we demonstrated that the proposed method outperforms the existing meta-analysis methods in terms of true positive selection. We also applied the proposed method to a peanut dataset with 6 quantitative and 2 qualitative traits, and a cowpea dataset with 2 quantitative and 6 qualitative traits. We were able to detect some potentially pleiotropic variants missed by the existing methods in both analyses. CONCLUSIONS: The proposed method is able to locate pleiotropic variants associated with both quantitative and qualitative traits. It has been implemented in an R package 'UNISS', which can be downloaded from http://github.com/statpng/uniss.
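
For context on the meta-analysis alternative mentioned in the BACKGROUND, one classical way to integrate per-trait summary statistics for a single variant is Fisher's combined probability test. The sketch below is not the UNISS score and is not taken from the paper; it assumes independent per-trait p-values, and the function name is hypothetical.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values for one variant with Fisher's method."""
    if not pvalues or any(not 0.0 < p <= 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(pvalues)
    # Under the null, -2 * sum(log p) follows a chi-square with 2k df.
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    # Closed-form chi-square survival function for even degrees of freedom:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# One p-value from a quantitative trait and one from a qualitative trait (toy values).
combined = fisher_combined_pvalue([0.04, 0.20])
```

Because the test treats each variant separately, it illustrates the limitation noted in the abstract: correlations among variants are not taken into account.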


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Computer Simulation , Genetic Association Studies , Phenotype
3.
Genes (Basel) ; 12(1), 2020 Dec 22.
Article in English | MEDLINE | ID: mdl-33375051

ABSTRACT

Peanut (Arachis hypogaea L.) is one of the important oil crops of the world. In this study, we aimed to evaluate the genetic diversity of 384 peanut germplasms, including 100 Korean germplasms and 284 core collections from the United States Department of Agriculture (USDA), using an Axiom_Arachis array with 58K single-nucleotide polymorphisms (SNPs). We evaluated the evolutionary relationships among the 384 peanut germplasms using a genome-wide association study (GWAS) of seed aspect ratio data processed with ImageJ software. In total, 14,030 filtered polymorphic SNPs were identified from the peanut 58K SNP array. We identified five SNPs with significant associations with seed aspect ratio on chromosomes Aradu.A09, Aradu.A10, Araip.B08, and Araip.B09. AX-177640219 on chromosome Araip.B08 was the most significantly associated marker in both the GAPIT and regularization methods. Phosphoenolpyruvate carboxylase (PEPC) was found among the eleven genes within linkage disequilibrium (LD) of the significant SNPs on Araip.B08 and could have a strong causal effect in determining seed aspect ratio. The results of the present study provide information and methods that are useful for further genetic and genomic studies as well as molecular breeding programs in peanuts.


Subject(s)
Arachis/genetics , Genome, Plant/genetics , Plant Breeding , Quantitative Trait Loci , Seeds/anatomy & histology , Arachis/growth & development , Genome-Wide Association Study , Linkage Disequilibrium , Microsatellite Repeats , Organ Size/genetics , Phosphoenolpyruvate Carboxylase/genetics , Plant Proteins/genetics , Polymorphism, Single Nucleotide , Seeds/genetics
4.
Plants (Basel) ; 9(9), 2020 Sep 12.
Article in English | MEDLINE | ID: mdl-32932572

ABSTRACT

Cowpea is one of the most essential legume crops providing inexpensive dietary protein and nutrients. The aim of this study was to understand the genetic diversity and population structure of global and Korean cowpea germplasms. A total of 384 cowpea accessions from 21 countries were genotyped with the Cowpea iSelect Consortium Array containing 51,128 single-nucleotide polymorphisms (SNPs). After SNP filtering, a genetic diversity study was carried out using 35,116 SNPs within 376 cowpea accessions, including 229 Korean accessions. Based on structure and principal component analysis, a total of 376 global accessions were divided into four major populations. Accessions in group 1 were from Asia and Europe, those in groups 2 and 4 were from Korea, and those in group 3 were from West Africa. In addition, 229 Korean accessions were divided into three major populations (Q1, Jeonra province; Q2, Gangwon province; Q3, a mixture of provinces). Additionally, the neighbor-joining tree indicated similar results. Further genetic diversity analysis within the global and Korean population groups indicated low heterozygosity, a low polymorphism information content, and a high inbreeding coefficient in the Korean cowpea accessions. The population structure analysis will provide useful knowledge to support the genetic potential of the cowpea breeding program, especially in Korea.
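
The diversity measures named in this abstract (heterozygosity, polymorphism information content, inbreeding coefficient) have standard textbook definitions, sketched below for a single marker. The abstract does not say which software or exact estimators were used, so these formulas and function names are assumptions for illustration.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for allele frequencies p_i at one marker."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return expected_heterozygosity(freqs) - cross


def inbreeding_coefficient(observed_het, freqs):
    """F = 1 - Ho / He; values near 1 indicate strong inbreeding."""
    he = expected_heterozygosity(freqs)
    if he == 0:
        raise ValueError("marker is monomorphic; F is undefined")
    return 1.0 - observed_het / he


# Toy biallelic SNP with equal allele frequencies and few heterozygotes.
he = expected_heterozygosity([0.5, 0.5])
pic = polymorphism_information_content([0.5, 0.5])
f_is = inbreeding_coefficient(0.05, [0.5, 0.5])
```

With this toy marker, a low observed heterozygosity relative to `he` gives a high `f_is`, which is the pattern the abstract reports for the Korean accessions.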

5.
J Med Internet Res ; 22(5): e16084, 2020 May 05.
Article in English | MEDLINE | ID: mdl-32369034

ABSTRACT

BACKGROUND: Prognostic genes or gene signatures have been widely used to predict patient survival and aid in making decisions pertaining to therapeutic actions. Although some web-based survival analysis tools have been developed, they have several limitations. OBJECTIVE: Taking these limitations into account, we developed ESurv (Easy, Effective, and Excellent Survival analysis tool), a web-based tool that can perform advanced survival analyses using user-derived data or data from The Cancer Genome Atlas (TCGA). Users can conduct univariate analyses and grouped variable selections using multiomics data from TCGA. METHODS: We used R to code survival analyses based on multiomics data from TCGA. To perform these analyses, we excluded patients and genes that had insufficient information. Clinical variables were classified as 0 and 1 when there were two categories (for example, chemotherapy: no or yes), and dummy variables were used where features had 3 or more outcomes (for example, with respect to laterality: right, left, or bilateral). RESULTS: Through univariate analyses, ESurv can identify the prognostic significance for single genes using the survival curve (median or optimal cutoff), area under the curve (AUC) with C statistics, and receiver operating characteristics (ROC). Users can obtain prognostic variable signatures based on multiomics data from clinical variables or grouped variable selections (lasso, elastic net regularization, and network-regularized high-dimensional Cox-regression) and select the same outputs as above. In addition, users can create custom gene signatures for specific cancers using various genes of interest. One of the most important functions of ESurv is that users can perform all survival analyses using their own data. CONCLUSIONS: Using advanced statistical techniques suitable for high-dimensional data, including genetic data, and integrated survival analysis, ESurv overcomes the limitations of previous web-based tools and will help biomedical researchers easily perform complex survival analyses.
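
The clinical-variable coding described in METHODS can be sketched as follows. ESurv itself is written in R; this Python sketch is only an illustration, and the function name, the column naming, and the choice of the first sorted level as the reference category are assumptions not stated in the abstract.

```python
def encode_clinical(values):
    """Encode one categorical clinical variable as 0/1 columns.

    Two categories give a single 0/1 column. Three or more categories give
    one dummy column per non-reference level. The reference level is the
    first level in sorted order (an assumption of this sketch).
    """
    levels = sorted(set(values))
    if len(levels) < 2:
        raise ValueError("need at least two categories")
    reference, others = levels[0], levels[1:]
    return {
        f"{level}_vs_{reference}": [1 if v == level else 0 for v in values]
        for level in others
    }


chemo = encode_clinical(["no", "yes", "yes", "no"])
laterality = encode_clinical(["right", "left", "bilateral", "left"])
```

The two-category chemotherapy variable collapses to one indicator column, while the three-category laterality variable yields two dummy columns against the assumed reference level.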


Subject(s)
Neoplasms/genetics , Survival Analysis , Humans , Internet , Neoplasms/mortality , Prognosis
6.
J Bioinform Comput Biol ; 18(1): 2050002, 2020 Feb.
Article in English | MEDLINE | ID: mdl-32336254

ABSTRACT

Gene set analysis aims to identify differentially expressed or co-expressed genes within a biological pathway between two experimental conditions, so that it can eventually reveal biological processes and pathways involved in disease development. In the last few decades, various statistical and computational methods have been proposed to improve the statistical power of gene set analysis. In recent years, much attention has been paid to differentially co-expressed genes, since they can be disease-related genes even without a significant difference in average expression levels between two conditions. In this paper, we propose a new statistical method to identify differentially co-expressed genes from microarray gene expression data. The proposed method first estimates co-expression levels of paired genes using covariance regularization by thresholding, and then evaluates the significance of the difference in the covariance estimates between two conditions. Through extensive simulation studies, we demonstrated that the proposed method is more powerful than the existing mainstream methods for detecting co-expressed genes. We also applied it to various microarray gene expression datasets related to mutant p53 transcriptional activity, and to epithelium and stroma in breast cancer.
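
The first step of the method, covariance regularization by thresholding, can be sketched with plain hard thresholding of the sample covariance matrix. The abstract does not state the thresholding rule or how the threshold is chosen, so the hard-threshold rule, the fixed threshold `t`, and all function names below are assumptions; the significance evaluation in the second step is not shown.

```python
def sample_covariance(rows):
    """rows: list of samples, each a list of expression values (one per gene)."""
    n, p = len(rows), len(rows[0])
    if n < 2:
        raise ValueError("need at least two samples")
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]


def hard_threshold(cov, t):
    """Set off-diagonal entries with magnitude below t to zero."""
    p = len(cov)
    return [
        [cov[i][j] if i == j or abs(cov[i][j]) >= t else 0.0 for j in range(p)]
        for i in range(p)
    ]


def coexpression_difference(rows_a, rows_b, t):
    """Entrywise difference of thresholded covariances between two conditions."""
    a = hard_threshold(sample_covariance(rows_a), t)
    b = hard_threshold(sample_covariance(rows_b), t)
    p = len(a)
    return [[a[i][j] - b[i][j] for j in range(p)] for i in range(p)]


# Two genes: strongly co-expressed in condition A, weakly in condition B.
diff = coexpression_difference(
    [[1, 1], [2, 2], [3, 3]], [[1, 3], [2, 1], [3, 2]], t=0.75
)
```

In this toy case both genes have the same variance in the two conditions, so only the off-diagonal (co-expression) entry of `diff` is non-zero.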


Subject(s)
Breast Neoplasms/genetics , Computational Biology/methods , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Breast Neoplasms/pathology , Computer Simulation , Female , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation, Neoplastic , Humans , Mutation , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Tumor Suppressor Protein p53/genetics
7.
Nanomaterials (Basel) ; 10(1), 2020 Jan 09.
Article in English | MEDLINE | ID: mdl-31936438

ABSTRACT

M13 bacteriophage-based colorimetric sensors, especially multi-array sensors, have been successfully demonstrated to be a powerful platform for detecting extremely small amounts of target molecules. Colorimetric sensors can be fabricated easily using self-assembly of genetically engineered M13 bacteriophage which incorporates peptide libraries on its surface. However, the ability to discriminate many types of target molecules is still required. In this work, we introduce a statistical method to efficiently analyze a huge amount of numerical results in order to classify various types of target molecules. To enhance the selectivity of M13 bacteriophage-based colorimetric sensors, a multi-array sensor system can be an appropriate platform. On this basis, a pattern-recognizing multi-array biosensor platform was fabricated by integrating three types of sensors in which genetically engineered M13 bacteriophages (wild-, RGD-, and EEEE-type) were utilized as a primary building block. This sensor system was used to analyze a pattern of color change caused by a reaction between the sensor array and external substances, followed by separating the specific target substances by means of hierarchical cluster analysis. The biosensor platform could detect drug contaminants such as hormone drugs (estrogen) and antibiotics. We expect that the proposed biosensor system could be used for the development of a first-analysis kit, which would be inexpensive and easy to supply and could be applied in monitoring the environment and health care.

8.
J Med Genet ; 57(4): 217-225, 2020 Apr.
Article in English | MEDLINE | ID: mdl-31649053

ABSTRACT

BACKGROUND: Pheochromocytoma and paraganglioma (PPGL) are tumours that arise from chromaffin cells. Some genetic mutations influence PPGL, among which those in genes encoding subunits of succinate dehydrogenase (SDHA, SDHB, SDHC and SDHD) and its assembly factor (SDHAF2) are the most relevant. However, the risk of metastasis posed by these mutations has not been reported except for SDHB and SDHD mutations. This study aimed to update the metastatic risks, considering the prevalence and incidence of each SDHx mutation, which were previously considered all together. METHODS: We searched EMBASE and MEDLINE and selected 27 articles. The patients included in the studies were divided into three groups depending on the presence of PPGL. We checked the heterogeneity between studies and performed a meta-analysis using the Hartung-Knapp-Sidik-Jonkman method based on a random-effects model. RESULTS: The highest PPGL prevalence was for SDHB mutation, ranging from 23% to 31%, and for SDHC mutation (23%), followed by that for SDHA mutation (16%). The lowest prevalence was for SDHD mutation, ranging from 6% to 8%. SDHAF2 mutation showed no metastatic events. The PPGL incidence showed a tendency similar to that of its prevalence, with the highest risk of metastasis posed by SDHB mutation (12%-41%) and the lowest risk by SDHD mutation (~4%). CONCLUSION: There was no integrated evidence of how SDHx mutations are related to metastatic PPGL. However, these findings suggest that SDHA, SDHB and SDHC mutations are highly associated with metastatic PPGL and should be tested as indicators of metastasis in patients with PPGL.
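
The pooling step named in METHODS can be sketched as a random-effects meta-analysis with the Hartung-Knapp-Sidik-Jonkman (HKSJ) variance adjustment. The abstract does not say which between-study variance estimator or effect scale was used, so the DerSimonian-Laird estimator, the generic effect inputs, and the function name below are assumptions.

```python
import math


def hksj_random_effects(effects, variances):
    """Return (pooled estimate, HKSJ standard error, tau^2) for k >= 2 studies."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need matching effects and variances for >= 2 studies")
    # DerSimonian-Laird estimate of between-study variance (assumed choice).
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights and pooled estimate.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    # HKSJ variance: weighted residual mean square divided by total weight.
    var_hksj = sum(
        wi * (yi - pooled) ** 2 for wi, yi in zip(w_re, effects)
    ) / ((k - 1) * sum(w_re))
    return pooled, math.sqrt(var_hksj), tau2


pooled, se, tau2 = hksj_random_effects([0.1, 0.2, 0.3], [0.01, 0.01, 0.01])
```

A confidence interval would then use a t quantile with k - 1 degrees of freedom around `pooled`, rather than a normal quantile; that step needs a statistics library and is left out of this sketch.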


Subject(s)
Electron Transport Complex II/genetics , Membrane Proteins/genetics , Paraganglioma/genetics , Pheochromocytoma/genetics , Succinate Dehydrogenase/genetics , Adrenal Gland Neoplasms/genetics , Adrenal Gland Neoplasms/pathology , Germ-Line Mutation/genetics , Heterozygote , Humans , Mitochondrial Proteins/genetics , Neoplasm Metastasis , Paraganglioma/pathology , Pheochromocytoma/pathology
9.
BMC Bioinformatics ; 20(1): 510, 2019 Oct 22.
Article in English | MEDLINE | ID: mdl-31640538

ABSTRACT

BACKGROUND: In human genetic association studies with high-dimensional gene expression data, it is well known that statistical selection methods utilizing prior biological network knowledge, such as genetic pathways and signaling pathways, can outperform methods that ignore genetic network structures in terms of true positive selection. In recent epigenetic research on case-control association studies, a relatively large number of statistical methods have been proposed to identify cancer-related CpG sites and their corresponding genes from high-dimensional DNA methylation array data. However, most existing methods are not designed to utilize genetic network information, although methylation levels of linked genes in the genetic networks tend to be highly correlated with each other. RESULTS: We propose a new approach that combines data dimension reduction techniques with network-based regularization to identify outcome-related genes in the analysis of high-dimensional DNA methylation data. In simulation studies, we demonstrated that the proposed approach outperforms other statistical methods that do not utilize genetic network information in terms of true positive selection. We also applied it to the 450K DNA methylation array data of the four breast invasive carcinoma cancer subtypes from The Cancer Genome Atlas (TCGA) project. CONCLUSIONS: The proposed variable selection approach can utilize prior biological network information for the analysis of high-dimensional DNA methylation array data. It first captures gene-level signals from multiple CpG sites using a data dimension reduction technique and then performs network-based regularization based on biological network graph information. It can select potentially cancer-related genes and genetic pathways that were missed by the existing methods.
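
To make the idea of network-based regularization concrete, the sketch below evaluates one commonly used form of network penalty: an L1 term plus a degree-normalized graph-Laplacian smoothness term over linked genes. The abstract does not give the penalty, so this particular form, the parameters `lam` and `alpha`, and the function name are assumptions, and the dimension-reduction step is not shown.

```python
import math


def network_penalty(beta, edges, lam, alpha):
    """lam * (alpha * sum|beta_j| + (1 - alpha) * Laplacian smoothness term).

    beta:  gene-level coefficients
    edges: list of (u, v) index pairs for linked genes, with u != v
    """
    degree = [0] * len(beta)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    l1 = sum(abs(b) for b in beta)
    # Penalize differences between degree-scaled coefficients of linked genes.
    smooth = sum(
        (beta[u] / math.sqrt(degree[u]) - beta[v] / math.sqrt(degree[v])) ** 2
        for u, v in edges
    )
    return lam * (alpha * l1 + (1.0 - alpha) * smooth)


# Two linked genes with equal effects versus opposite effects.
same_sign = network_penalty([1.0, 1.0, 0.0], [(0, 1)], lam=1.0, alpha=0.5)
opposite = network_penalty([1.0, -1.0, 0.0], [(0, 1)], lam=1.0, alpha=0.5)
```

Linked genes with similar coefficients incur a smaller penalty than linked genes with opposing coefficients, which is how such a penalty encourages selecting connected genes together.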


Subject(s)
DNA Methylation , Epigenomics , Gene Regulatory Networks , Genetic Association Studies , Breast Neoplasms/genetics , Case-Control Studies , Computer Simulation , CpG Islands , Female , Humans , Oligonucleotide Array Sequence Analysis
10.
Leukemia ; 33(12): 2912-2923, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31138843

ABSTRACT

A large body of evidence suggests that B-cell lymphomas with enhanced Myc expression are associated with an aggressive phenotype and poor prognosis, which makes Myc a compelling therapeutic target. Phosphodiesterase 4B (PDE4B), a main hydrolyzer of cyclic AMP (cAMP) in B cells, was shown to be involved in cell survival and drug resistance in diffuse large B cell lymphomas (DLBCL). However, the interrelationship between Myc and PDE4B remains unclear. Here, we first demonstrate the presence of the Myc-PDE4B feed-forward loop, in which Myc and PDE4B mutually reinforce the expression of each other. Next, the combined targeting of Myc and PDE4 synergistically prevented the proliferation and survival of B lymphoma cells in vitro and in a mouse xenograft model. We finally recapitulated this combinatorial effect in Eµ-myc transgenic mice; co-inhibition of Myc and PDE4 suppressed lymphomagenesis and restored B cell development to the wild type level that was associated with marked reduction in Myc levels, unveiling the critical role of the Myc-PDE4B amplification loop in the regulation of Myc expression and the pathogenesis of B cell lymphoma. These findings suggest that the disruption of the Myc-PDE4B circuitry can be exploited in the treatment of B cell malignancies.


Subject(s)
Cyclic Nucleotide Phosphodiesterases, Type 4/genetics , Gene Expression Regulation, Neoplastic , Lymphoma, B-Cell/genetics , Lymphoma, B-Cell/mortality , Proto-Oncogene Proteins c-myc/genetics , Animals , Biomarkers, Tumor , Cell Line, Tumor , Cyclic AMP/metabolism , Cyclic Nucleotide Phosphodiesterases, Type 4/metabolism , Disease Models, Animal , Humans , Immunohistochemistry , Lymphoma, B-Cell/metabolism , Lymphoma, B-Cell/pathology , Lymphoma, Large B-Cell, Diffuse/genetics , Lymphoma, Large B-Cell, Diffuse/metabolism , Lymphoma, Large B-Cell, Diffuse/mortality , Lymphoma, Large B-Cell, Diffuse/pathology , Mice, Transgenic , Prognosis , Protein Binding , Proto-Oncogene Proteins c-myc/metabolism
11.
Bone Res ; 6: 20, 2018.
Article in English | MEDLINE | ID: mdl-30002945

ABSTRACT

Free fatty acids (FFAs), which are elevated in metabolic syndrome, are considered the principal offender exerting lipotoxicity. A few previous studies have reported a causal relationship between FFAs and osteoarthritis pathogenesis. However, the molecular mechanism by which FFAs exert lipotoxicity and induce osteoarthritis remains largely unknown. Here we observed that oleate in the usual clinical range did not exert lipotoxicity, while oleate in high pathological ranges exerted lipotoxicity through apoptosis in articular chondrocytes. By investigating the differential effect of oleate at toxic and nontoxic concentrations, we revealed that lipid droplet (LD) accumulation confers resistance to lipotoxicity on articular chondrocytes. Using high fat diet-induced osteoarthritis models and articular chondrocytes treated with oleate alone or oleate plus palmitate, we demonstrated that articular chondrocytes gain resistance to lipotoxicity through protein kinase casein kinase 2 (PKCK2)-, six-transmembrane protein of prostate 2 (STAMP2)-, and fat-specific protein 27 (FSP27)-mediated LD accumulation. We further observed that FFA-induced lipotoxicity was correlated with the increased concentration of cellular FFAs freed from LDs, whether the FFAs were saturated or not. In conclusion, PKCK2/STAMP2/FSP27-mediated sequestration of FFAs in LDs rescues osteoarthritic chondrocytes. PKCK2/STAMP2/FSP27 should be considered for interventions against metabolic OA.

12.
J Bioinform Comput Biol ; 16(4): 1850010, 2018 Aug.
Article in English | MEDLINE | ID: mdl-29954287

ABSTRACT

In genetic association studies, regularization methods are often used because of their computational efficiency in the analysis of high-dimensional genomic data. DNA methylation data generated from the Infinium HumanMethylation450 BeadChip Kit have a group structure in which an individual gene consists of multiple Cytosine-phosphate-Guanine (CpG) sites. Consequently, group-based regularization can precisely detect outcome-related CpG sites. Representative examples are sparse group lasso (SGL) and network-based regularization. The former is powerful when most of the CpG sites within the same gene are associated with a phenotype outcome. In contrast, the latter is preferred when only a few of the CpG sites within the same gene are related to the outcome. In this paper, we propose a new variable selection strategy based on a selection probability that measures the selection frequency of individual variables selected by both SGL and network-based regularization. In an extensive simulation study, we demonstrated that the proposed strategy shows relatively strong selection performance in every situation considered, compared with both SGL and network-based regularization. We also applied the proposed strategy to identify differentially methylated CpG sites and their corresponding genes from ovarian cancer data.


Subject(s)
Computational Biology/methods , DNA Methylation , Human Genetics/methods , Ovarian Neoplasms/genetics , CpG Islands , Female , Humans , Polymorphism, Single Nucleotide , Probability
13.
Oncotarget ; 8(14): 23690-23701, 2017 Apr 04.
Article in English | MEDLINE | ID: mdl-28423593

ABSTRACT

Hyper-activation of PAK1 (p21-activated kinase 1) is frequently observed in human cancer and has been proposed as a target for novel anti-tumor drugs. Previously, we showed that PAK1 is highly activated in the Smad4-deficient condition and suppresses PUMA (p53 upregulated modulator of apoptosis) through direct binding and phosphorylation. On the basis of this result, we sought novel PAK1-PUMA binding inhibitors. Through ELISA-based blind chemical library screening, we isolated a single compound, IPP-14 (IPP; Inhibitor of PAK1-PUMA), which selectively blocks PAK1-PUMA binding and also suppresses cell proliferation in a PUMA-dependent manner. Indeed, in PUMA-deficient cells, this chemical did not show an anti-proliferative effect. This chemical possessed such strong PAK1 inhibition activity that it suppressed BAD (Bcl-2-associated death promoter) phosphorylation and metaphase arrest via Aurora kinase inactivation at lower concentrations than the previous PAK1 kinase inhibitors FRAX486 and AG879. Moreover, our chemical clearly induced p21/WAF1/CIP1 (Cyclin-dependent kinase inhibitor 1A) expression by releasing it from Bcl-2 (B-cell lymphoma-2) and by inhibiting AKT-mediated p21 suppression. Considering our results, IPP-14 and its derivatives are possible candidates for anti-cancer drugs targeting PAK1 and p21 induction.


Subject(s)
Apoptosis Regulatory Proteins/biosynthesis , Cyclin-Dependent Kinase Inhibitor p21/metabolism , Neoplasms/drug therapy , Protein Kinase Inhibitors/pharmacology , Proto-Oncogene Proteins/biosynthesis , p21-Activated Kinases/antagonists & inhibitors , Apoptosis Regulatory Proteins/metabolism , Cell Cycle Checkpoints/drug effects , Cell Death/drug effects , Cell Line, Tumor , Cell Movement/drug effects , Cell Proliferation/drug effects , Child , Female , HCT116 Cells , Humans , Neoplasms/enzymology , Neoplasms/metabolism , Proto-Oncogene Proteins/metabolism , Proto-Oncogene Proteins c-bcl-2/biosynthesis , Small Molecule Libraries/pharmacology , p21-Activated Kinases/metabolism
14.
J Comput Biol ; 24(5): 400-411, 2017 May.
Article in English | MEDLINE | ID: mdl-28281787

ABSTRACT

In human genome research, genetic association studies of rare variants have been widely conducted since the advent of high-throughput DNA sequencing platforms. However, detection of outcome-related rare variants remains a statistically challenging problem because the number of observed genetic mutations is extremely small. Recently, a power set-based statistical selection procedure has been proposed to locate both risk and protective rare variants within outcome-related genes or genetic regions. Although it can perform an individual selection of rare variants, the procedure has the limitation that it cannot measure the certainty of the selected rare variants. In this article, we propose a selection probability of individual rare variants, where selection frequencies of rare variants are computed based on bootstrap resampling. It can therefore quantify the certainty of both selected and unselected rare variants. A new selection approach using a threshold on the selection probability is also introduced and compared with some existing selection procedures in extensive simulation studies and real sequencing data analysis. We have demonstrated that the proposed approach outperforms the existing methods in terms of selection power.
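
The bootstrap-based selection probability and the threshold rule described above can be sketched as follows. The power set-based procedure itself is not reproduced here: `toy_select` is a hypothetical stand-in that simply picks variants seen only in cases, and the data layout, function names, and number of resamples are assumptions for illustration.

```python
import random


def selection_probability(rows, variants, select, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each variant is selected."""
    rng = random.Random(seed)
    counts = {v: 0 for v in variants}  # unselected variants keep a count of 0
    n = len(rows)
    for _ in range(n_boot):
        resample = [rows[rng.randrange(n)] for _ in range(n)]
        for v in select(resample):
            counts[v] += 1
    return {v: c / n_boot for v, c in counts.items()}


def select_above(probabilities, threshold):
    """Keep variants whose selection probability reaches the threshold."""
    return sorted(v for v, p in probabilities.items() if p >= threshold)


def toy_select(sample):
    """Hypothetical stand-in selector: variants carried by cases only."""
    in_cases, in_controls = set(), set()
    for carried, is_case in sample:
        (in_cases if is_case else in_controls).update(carried)
    return in_cases - in_controls


# Each row: (set of rare variants carried, case status). Toy data only.
rows = [({"rv1"}, True), ({"rv1", "rv2"}, True), ({"rv2"}, False), (set(), False)]
probs = selection_probability(rows, ["rv1", "rv2", "rv3"], toy_select)
chosen = select_above(probs, 0.5)
```

Every variant receives a probability, including those that are never selected, which is the sense in which the certainty of unselected variants is also quantified.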


Subject(s)
Computational Biology/methods , Genetic Association Studies/methods , Genetic Variation , Algorithms , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Models, Genetic , Models, Statistical , Mutation Rate , Sequence Analysis, DNA
15.
Bioinformatics ; 33(12): 1765-1772, 2017 Jun 15.
Article in English | MEDLINE | ID: mdl-28165116

ABSTRACT

MOTIVATION: DNA methylation plays an important role in many biological processes and cancer progression. Recent studies have found that, in addition to differences in methylation means, there are also differences in methylation variation between groups. Several methods have been developed that consider both mean and variance signals in order to improve the statistical power of detecting differentially methylated loci. Moreover, as methylation levels of neighboring CpG sites are known to be strongly correlated, methods that incorporate correlations have also been developed. We previously developed a network-based penalized logistic regression for correlated methylation data, but it focuses only on mean signals. We have also developed a generalized exponential tilt model that captures both mean and variance signals but examines only one CpG site at a time. RESULTS: In this article, we propose a penalized Exponential Tilt Model (pETM) using network-based regularization that captures both mean and variance signals in DNA methylation data and takes into account the correlations among nearby CpG sites. By combining the strengths of the two models we previously developed, we demonstrated the superior power and better performance of the pETM method through simulations and applications to the 450K DNA methylation array data of the four breast invasive carcinoma cancer subtypes from The Cancer Genome Atlas (TCGA) project. The developed pETM method identifies many cancer-related methylation loci that were missed by our previously developed method, which considers correlations among nearby methylation loci but not variance signals. AVAILABILITY AND IMPLEMENTATION: The R package 'pETM' is publicly available through CRAN: http://cran.r-project.org. CONTACT: sw2206@columbia.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
CpG Islands , DNA Methylation , Genomics/methods , Software , Breast Neoplasms/genetics , Female , Humans , Logistic Models , Models, Genetic , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA/methods
16.
Oncotarget ; 7(23): 35144-58, 2016 Jun 07.
Article in English | MEDLINE | ID: mdl-27147573

ABSTRACT

Stress has been suggested as an important cause of human cancer, but without molecular biological evidence. Thus, we tested the effect of stress-related hormones on cell viability and mitotic fidelity. Similarly to estrogen, the stress hormone cortisol and its relative cortisone increase microtubule organizing center (MTOC) number through elevated expression of γ-tubulin and confer Taxol resistance on human cancer cell lines. However, these effects are mediated by the glucocorticoid hormone receptor (GR) and not by the estrogen receptor (ER). Since ginsenosides possess a steroid-like structure, we hypothesized that they would block stress- or estrogen-induced MTOC amplification and Taxol resistance. Among the tested chemicals, a rare ginsenoside, CSH1 (Rg6), shows a clear inhibitory effect on MTOC amplification, γ-tubulin induction and Taxol resistance. Compared with Fulvestrant (FST), an ER-α specific inhibitor, this chemical can block cortisol/cortisone-induced MTOC deregulation as well as ER-α signaling. Our results suggest that stress hormone-induced tumorigenesis may be achieved by MTOC amplification, and that CSH1 may be useful for the prevention of stress hormone- or steroid hormone-induced chromosomal instability.


Subject(s)
Cortisone/pharmacology , Ginsenosides/pharmacology , Hydrocortisone/pharmacology , Microtubule-Organizing Center/drug effects , Stress, Psychological/complications , Cell Line, Tumor , Humans , Paclitaxel/pharmacology , Stress, Psychological/metabolism , Stress, Psychological/pathology
17.
Genomics Inform ; 14(4): 187-195, 2016 Dec.
Article in English | MEDLINE | ID: mdl-28154510

ABSTRACT

In genetic association studies with high-dimensional genomic data, multiple group testing procedures are often required in order to identify disease/trait-related genes or genetic regions, where multiple genetic sites or variants are located within the same gene or genetic region. However, statistical testing procedures based on an individual test suffer from multiple testing issues such as the control of the family-wise error rate and dependent tests. Moreover, detecting only a few genes associated with a phenotype outcome among tens of thousands of genes is of main interest in genetic association studies. For this reason, regularization procedures, in which a phenotype outcome is regressed on all genomic markers and the regression coefficients are estimated based on a penalized likelihood, have been considered a good alternative approach to the analysis of high-dimensional genomic data. However, the selection performance of regularization procedures has rarely been compared with that of statistical group testing procedures. In this article, we performed extensive simulation studies in which commonly used group testing procedures such as principal component analysis, Hotelling's T² test, and the permutation test are compared with group lasso (least absolute shrinkage and selection operator) in terms of true positive selection. We also applied all methods considered in the simulation studies to identify genes associated with ovarian cancer from over 20,000 genetic sites generated from the Illumina Infinium HumanMethylation27K Beadchip. We found a large discrepancy in the selected genes between the multiple group testing procedures and group lasso.
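
Of the testing procedures compared in this abstract, the permutation test is the simplest to illustrate. The sketch below is a univariate, exact two-sample version based on the absolute difference in means; the paper's group-level test over multiple sites per gene is not specified in the abstract, so the statistic and function name here are assumptions, and exhaustive enumeration is only practical for small samples.

```python
from itertools import combinations


def exact_permutation_pvalue(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    if not group_a or not group_b:
        raise ValueError("both groups must be non-empty")
    pooled = list(group_a) + list(group_b)
    n_a, n = len(group_a), len(pooled)
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / (n - n_a))

    observed = abs_mean_diff(sum(group_a))
    hits = count = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(n), n_a):
        count += 1
        if abs_mean_diff(sum(pooled[i] for i in idx)) >= observed - 1e-12:
            hits += 1
    return hits / count


p_value = exact_permutation_pvalue([1, 2, 3], [4, 5, 6])
```

With three observations per group there are only 20 relabellings, so the smallest attainable two-sided p-value is 2/20; larger studies would sample permutations at random instead of enumerating them.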

18.
J Comput Biol ; 22(11): 1034-43, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26469994

ABSTRACT

In genetic association studies with deep sequencing data, it is a challenging statistical problem to precisely locate rare variants associated with complex diseases or traits due to the limited number of observed genetic mutations. In particular, both risk and protective rare variants can be present in the same gene or genetic region. There currently exist very few statistical methods to separate causal rare variants from noncausal variants within a disease/trait-related gene or a genetic region, while there are relatively many statistical tests to detect a phenotypic association of a group of rare variants such as a gene or a genetic region. In this article, we propose a new statistical selection strategy that is able to locate causal rare variants within the disease/trait-related gene or a genetic region. The proposed procedure linearly combines potential risk and protective variants to find the combination of rare variants with the strongest association signal. It is also computationally very efficient since the procedure is based on forward selection. In simulation studies we demonstrate that the proposed procedure has better selection performance than other existing methods when both risk and protective variants are present. We also applied it to real sequencing data on the ANGPTL gene family from the Dallas Heart Study.
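
The idea described in the abstract above, forward selection over a signed linear combination of variants, can be sketched as follows. This is only an illustration under stated assumptions: the association signal is taken to be the absolute Pearson correlation between the combined score and the phenotype, and the names `corr` and `forward_select` are hypothetical. The authors' actual statistic and stopping rule are not given in the abstract.

```python
import math


def corr(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def forward_select(genotypes, phenotype, tol=1e-12):
    """Greedy forward selection of variants with signs (+1 risk, -1 protective).

    genotypes: list of rows, one per individual, each a list of allele counts.
    Returns (signs, best) where signs maps variant index -> +1 or -1.
    """
    n_variants = len(genotypes[0])
    score = [0.0] * len(genotypes)
    signs = {}
    best = 0.0
    while len(signs) < n_variants:
        candidate = None
        for j in range(n_variants):
            if j in signs:
                continue
            for s in (1, -1):
                trial = [sc + s * row[j] for sc, row in zip(score, genotypes)]
                stat = abs(corr(trial, phenotype))
                if stat > best + tol:
                    best, candidate = stat, (j, s, trial)
        if candidate is None:  # no variant strengthens the signal; stop
            break
        j, s, score = candidate
        signs[j] = s
    # Orient signs so the combined score correlates positively with the phenotype.
    if corr(score, phenotype) < 0:
        signs = {j: -s for j, s in signs.items()}
    return signs, best
```

Each step evaluates at most two candidate scores per remaining variant, so the number of evaluations grows quadratically with the number of variants rather than exponentially, which is consistent with the efficiency claim in the abstract.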


Subject(s)
Data Interpretation, Statistical , Algorithms , Gene Frequency , Genetic Association Studies , Genetic Predisposition to Disease , Genome, Human , Humans , Multifactorial Inheritance , Protective Factors , Risk Factors
19.
Bioinformatics ; 30(16): 2317-23, 2014 Aug 15.
Article in English | MEDLINE | ID: mdl-24755303

ABSTRACT

MOTIVATION: Existing association methods for rare variants from sequencing data have focused on aggregating variants in a gene or a genetic region because analysing individual rare variants is underpowered. However, these existing rare variant detection methods are not able to identify which of the rare variants in a gene or a genetic region are associated with the complex diseases or traits. Once phenotypic associations of a gene or a genetic region are identified, the natural next step in the association study with sequencing data is to locate the susceptible rare variants within the gene or the genetic region. RESULTS: In this article, we propose a power set-based statistical selection procedure that is able to identify the locations of the potentially susceptible rare variants within a disease-related gene or a genetic region. The selection performance of the proposed procedure was evaluated through simulation studies, where we demonstrated its feasibility and superior power over several comparable existing methods. In particular, the proposed method is able to handle mixed effects when both risk and protective variants are present in a gene or a genetic region. The proposed selection procedure was also applied to the sequence data on the ANGPTL gene family from the Dallas Heart Study to identify potentially susceptible rare variants within the trait-related genes. AVAILABILITY AND IMPLEMENTATION: An R package 'rvsel' can be downloaded from http://www.columbia.edu/~sw2206/ and http://statsun.pusan.ac.kr.


Subject(s)
Genetic Association Studies/methods , Genetic Variation , Sequence Analysis, DNA/methods , Angiopoietin-Like Protein 4 , Angiopoietins/genetics , Data Interpretation, Statistical , Energy Metabolism/genetics , Humans , Lipoproteins, VLDL/analysis , Phenotype , Triglycerides/analysis
20.
Stat Sin ; 24(3): 1433-1459, 2014 Jul.
Article in English | MEDLINE | ID: mdl-26316678

ABSTRACT

We consider estimation and variable selection in high-dimensional Cox regression when prior knowledge of the relationships among the covariates, described by a network or graph, is available. A limitation of the existing methodology for survival analysis with high-dimensional genomic data is that a wealth of structural information about many biological processes, such as regulatory networks and pathways, has often been ignored. In order to incorporate such prior network information into the analysis of genomic data, we propose a network-based regularization method for high-dimensional Cox regression; it uses an ℓ1-penalty to induce sparsity of the regression coefficients and a quadratic Laplacian penalty to encourage smoothness between the coefficients of neighboring variables on a given network. The proposed method is implemented by an efficient coordinate descent algorithm. In the setting where the dimensionality p can grow exponentially fast with the sample size n, we establish model selection consistency and estimation bounds for the proposed estimators. The theoretical results provide insights into the gain from taking into account the network structural information. Extensive simulation studies indicate that our method outperforms Lasso and elastic net in terms of variable selection accuracy and stability. We apply our method to a breast cancer gene expression study and identify several biologically plausible subnetworks and pathways that are associated with breast cancer distant metastasis.
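
The penalty structure described in the abstract above (an ℓ1 term plus a quadratic Laplacian term) can be written out as a generic objective. The scaling by 1/n, the tuning parameters λ1 and λ2, and the use of the unnormalized Laplacian L = D - W with edge weights w_jk are assumptions made for this sketch; the article may use a normalized Laplacian or a different parameterization.

```latex
% \ell_n(\beta): Cox log partial likelihood; L = D - W: graph Laplacian (assumed unnormalized)
\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p}
  \left\{ -\frac{1}{n}\,\ell_n(\beta)
          + \lambda_1 \lVert \beta \rVert_1
          + \lambda_2\, \beta^{\top} L \beta \right\},
\qquad
\beta^{\top} L \beta = \sum_{j \sim k} w_{jk}\,(\beta_j - \beta_k)^2 .
```

The second identity, with the sum taken once over each edge j ∼ k of the network, shows why the quadratic term encourages smoothness: it is small when coefficients of neighboring variables are close to each other.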
