Results 1 - 11 of 11
1.
Methods ; 111: 56-63, 2016 Dec 01.
Article in English | MEDLINE | ID: mdl-27480381

ABSTRACT

Hepatitis B virus (HBV) infection is strongly associated with an increased risk of liver diseases such as cirrhosis and hepatocellular carcinoma (HCC). Many lines of evidence suggest that deletions in HBV genomic DNA are closely associated with HBV activity through the interplay between the release of aberrant viral proteins and the human immune system. Finding deletions in HBV whole-genome sequences is therefore an important problem, although mining such large and complex biological data remains challenging. Although some next-generation sequencing (NGS) tools have recently been designed to identify structural variations such as insertions and deletions, they have generally been validated only on human sequences, and this design may not suit viral genomes. We propose a graphics processing unit (GPU)-based data mining method called DeF-GPU to efficiently and precisely identify HBV deletions from large NGS datasets, which generally contain millions of reads. To fit the single instruction, multiple data (SIMD) model, sequencing reads are treated as the multiple data and the deletion-finding procedure as the single instruction. We use Compute Unified Device Architecture (CUDA) to parallelize the procedures, and further validate DeF-GPU on 5 synthetic datasets and 1 real dataset. Our results suggest that DeF-GPU outperforms the commonly used method Pindel and identifies the ground-truth deletions exactly within a few seconds. The source code and other related materials are available at https://sourceforge.net/projects/defgpu/.
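The "one procedure, many reads" idea in this abstract can be illustrated with a small Python sketch. It is not DeF-GPU's CUDA kernel: the split-read rule, the function names `find_deletion` and `call_deletions`, and the `min_anchor` parameter are all assumptions made for illustration. The point is only that the same deletion-finding function is applied independently to every read, which is what makes the problem a natural fit for data-parallel hardware.

```python
def find_deletion(read, reference, min_anchor=4):
    """Return (start, length) of a deletion implied by a split read, or None.

    Hypothetical rule: split the read into a prefix and a suffix; if the
    prefix matches the reference exactly and the suffix matches further
    downstream, the skipped reference bases are reported as a deletion.
    """
    for split in range(min_anchor, len(read) - min_anchor + 1):
        prefix, suffix = read[:split], read[split:]
        p = reference.find(prefix)           # first exact match of the prefix
        if p == -1:
            continue
        del_start = p + split                # first reference base not covered
        s = reference.find(suffix, del_start + 1)
        if s != -1:
            return del_start, s - del_start  # gap between the two anchors
    return None


def call_deletions(reads, reference):
    """Apply the same procedure to every read (one 'instruction', many data).

    On a GPU each read would be handled by its own thread; here a plain
    map() stands in for that parallel launch.
    """
    return list(map(lambda r: find_deletion(r, reference), reads))
```

Because each read is processed without reference to any other read, the loop over reads can be distributed across threads with no synchronization. Exact matching and first-occurrence lookup keep the sketch short; a real caller would need to tolerate mismatches and repeats.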


Subjects
Computational Biology/methods , Viral Genome/genetics , Hepatitis B Virus/genetics , Hepatitis B/genetics , Hepatocellular Carcinoma/genetics , Hepatocellular Carcinoma/virology , Viral DNA/genetics , Hepatitis B/virology , Hepatitis B Virus/pathogenicity , High-Throughput Nucleotide Sequencing/methods , Humans , Liver Neoplasms/genetics , Liver Neoplasms/virology , Sequence Deletion/genetics , Software
2.
Hepatol Int ; 10(1): 147-57, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26208819

ABSTRACT

BACKGROUND: Hepatitis B virus (HBV) quasispecies are crucial in the pathogenesis of chronic liver disease. Next-generation sequencing (NGS) is powerful for identifying viral quasispecies. To improve mapping quality and single nucleotide variant (SNV) calling accuracy in the NGS analysis of HBV, we compared different mapping references for each sample: the sample-specific reference sequence, same-genotype sequences and different-genotype sequences. METHODS: Real Illumina HBV datasets from 86 patients, and simulated datasets from 158 HBV strains in the GenBank database, were used to assess mapping quality. SNV calling accuracy was evaluated by aligning real Illumina datasets from a single HBV clone against different mapping references. RESULTS: Using the sample-specific reference sequence as the mapping reference produced the largest number of mappable reads and the greatest coverage. With a different-genotype mapping reference, the consensus sequence derived from the real Illumina datasets of the single HBV clone showed 21 false SNV calls in the polymerase and surface genes, the regions most divergent between the mapping reference and this HBV clone. Even with a same-genotype mapping reference, most of these false SNVs were still observed at ~6% coverage, whereas none were observed with the sample-specific reference sequence. CONCLUSIONS: Using sample-specific reference sequences as the mapping reference in NGS analysis optimized mapping quality and SNV calling accuracy for HBV quasispecies.


Subjects
Hepatitis B Virus/genetics , Chronic Hepatitis B/virology , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods , Adult , Aged , Base Sequence , Viral DNA/analysis , Female , Genotype , Hepatitis B Virus/classification , High-Throughput Nucleotide Sequencing/standards , Humans , Male , Middle Aged , Single Nucleotide Polymorphism , Reference Values , Sequence Alignment , DNA Sequence Analysis/standards
3.
Nucleic Acids Res ; 43(3): 1593-608, 2015 Feb 18.
Article in English | MEDLINE | ID: mdl-25609695

ABSTRACT

Overexpression of Oct4, a stemness gene encoding a transcription factor, has been reported in several cancers. However, the mechanism by which Oct4 directs a transcriptional program that leads to somatic cancer progression remains unclear. In this study, we provide mechanistic insight into the Oct4-driven transcriptional network promoting drug resistance and metastasis in lung cancer through cell, animal and clinical studies. Through an integrative approach combining our Oct4 chromatin-immunoprecipitation sequencing and ENCODE datasets, we identified the genome-wide binding regions of Oct4 in lung cancer at the promoters and enhancers of numerous genes involved in critical pathways that promote tumorigenesis. Notably, PTEN and TNC were previously undefined targets of Oct4. In addition, novel Oct4-binding motifs were found to overlap with DNA elements for the Sp1 transcription factor. We provide evidence that Oct4 suppressed PTEN in an Sp1-dependent manner by recruitment of HDAC1/2, leading to activation of AKT signaling and drug resistance. In contrast, Oct4 transactivated TNC independently of Sp1, resulting in cancer metastasis. Clinically, an Oct4-high, PTEN-low and TNC-high expression profile in lung cancer patients correlated significantly with poor disease-free survival. Our study reveals a critical Oct4-driven transcriptional program that promotes lung cancer progression, illustrating the therapeutic potential of targeting Oct4 transcriptionally regulated genes.


Subjects
Antineoplastic Drug Resistance/genetics , Lung Neoplasms/genetics , Neoplasm Metastasis/genetics , Octamer Transcription Factor-3/genetics , PTEN Phosphohydrolase/genetics , Tenascin/genetics , Tumor Cell Line , Chromatin Immunoprecipitation , Humans , Lung Neoplasms/drug therapy , Lung Neoplasms/pathology , Polymerase Chain Reaction , Proto-Oncogene Proteins c-akt/metabolism , Signal Transduction , Genetic Transcription
4.
Bioinformatics ; 30(21): 3054-61, 2014 Nov 01.
Article in English | MEDLINE | ID: mdl-25015989

ABSTRACT

MOTIVATION: A rapid progression of esophageal squamous cell carcinoma (ESCC) causes a high mortality rate because of the propensity for metastasis driven by genetic and epigenetic alterations. The identification of prognostic biomarkers would help prevent or control metastatic progression. Expression analyses have been used to find such markers, but do not always validate in separate cohorts. Epigenetic marks, such as DNA methylation, are a potential source of more reliable and stable biomarkers. Importantly, the integration of both expression and epigenetic alterations is more likely to identify relevant biomarkers. RESULTS: We present a new analysis framework, using ESCC progression-associated gene regulatory network (GRN escc), to identify differentially methylated CpG sites prognostic of ESCC progression. From the CpG loci differentially methylated in 50 tumor-normal pairs, we selected 44 CpG loci most highly associated with survival and located in the promoters of genes more likely to belong to GRN escc. Using an independent ESCC cohort, we confirmed that 8/10 of CpG loci in the promoter of GRN escc genes significantly correlated with patient survival. In contrast, 0/10 CpG loci in the promoter genes outside the GRN escc were correlated with patient survival. We further characterized the GRN escc network topology and observed that the genes with methylated CpG loci associated with survival deviated from the center of mass and were less likely to be hubs in the GRN escc. We postulate that our analysis framework improves the identification of bona fide prognostic biomarkers from DNA methylation studies, especially with partial genome coverage.


Subjects
Squamous Cell Carcinoma/genetics , DNA Methylation , Genetic Epigenesis , Esophageal Neoplasms/genetics , Gene Regulatory Networks , Tumor Biomarkers/metabolism , Squamous Cell Carcinoma/mortality , CpG Islands , Disease Progression , Esophageal Neoplasms/mortality , Esophageal Squamous Cell Carcinoma , Humans , Genetic Promoter Regions
5.
BMC Bioinformatics ; 15: 173, 2014 Jun 08.
Article in English | MEDLINE | ID: mdl-24909518

ABSTRACT

BACKGROUND: Human disease often arises as a consequence of alterations in a set of associated genes rather than alterations to a set of unassociated individual genes. Most previous microarray-based meta-analyses identified disease-associated genes or biomarkers independent of genetic interactions. Therefore, in this study, we present the first meta-analysis method capable of taking gene combination effects into account to efficiently identify associated biomarkers (ABs) across different microarray platforms. RESULTS: We propose a new meta-analysis approach called MiningABs to mine ABs across different array-based datasets. The similarity between paired probe sequences is quantified as a bridge to connect these datasets together. The ABs can be subsequently identified from an "improved" common logit model (c-LM) by combining several sibling-like LMs in a heuristic genetic algorithm selection process. Our approach is evaluated with two sets of gene expression datasets: i) 4 esophageal squamous cell carcinoma and ii) 3 hepatocellular carcinoma datasets. Based on an unbiased reciprocal test, we demonstrate that each gene in a group of ABs is required to maintain high cancer sample classification accuracy, and we observe that ABs are not limited to genes common to all platforms. Investigating the ABs using Gene Ontology (GO) enrichment, literature survey, and network analyses indicated that our ABs are not only strongly related to cancer development but also highly connected in a diverse network of biological interactions. CONCLUSIONS: The proposed meta-analysis method called MiningABs is able to efficiently identify ABs from different independently performed array-based datasets, and we show its validity in cancer biology via GO enrichment, literature survey and network analyses. We postulate that the ABs may facilitate novel target and drug discovery, leading to improved clinical treatment. 
Java source code, tutorial, example and related materials are available at "http://sourceforge.net/projects/miningabs/".


Subjects
Data Mining/methods , Gene Expression Profiling , Gene Expression , Genetic Markers/genetics , Algorithms , Tumor Biomarkers/genetics , Gene Expression Profiling/methods , Humans , Neoplasms/genetics
6.
BMC Bioinformatics ; 14 Suppl 12: S3, 2013.
Article in English | MEDLINE | ID: mdl-24267918

ABSTRACT

BACKGROUND: Observing gene expression changes that imply gene regulation in repeated time-course experiments has become increasingly important. However, no effective method exists for handling this kind of data. For instance, in a clinical or biological progression such as an inflammatory response or cancer formation, a large number of differentially expressed genes can be identified at different time points through a large-scale microarray approach. For each repeated experiment with different samples, converting the microarray datasets into transactional databases containing the significant singleton genes at each time point would allow sequential patterns implying gene regulation to be identified. Although traditional sequential pattern mining methods have been widely and successfully used in other domains, such as mining customer purchasing sequences from a transactional database, to our knowledge they are not suitable for such biological datasets because every transaction in the converted database may contain too many items (genes). RESULTS: In this paper, we propose a new algorithm called CTGR-Span (Cross-Timepoint Gene Regulation Sequential pattern) to efficiently mine CTGR-SPs (Cross-Timepoint Gene Regulation Sequential Patterns), even on larger datasets where traditional algorithms are infeasible. CTGR-Span includes several biologically designed parameters based on the characteristics of gene regulation. We tune these parameters using a GO enrichment analysis so that the resulting CTGR-SPs are more biologically meaningful. The proposed method was evaluated on two publicly available human time-course microarray datasets and outperformed the traditional methods in execution efficiency. When checked against previous literature, the resulting patterns also correlated strongly with the experimental backgrounds of the datasets used in this study. 
CONCLUSIONS: We propose an efficient CTGR-Span to mine several biologically meaningful CTGR-SPs. We postulate that the biologist can benefit from our new algorithm since the patterns implying gene regulations could provide further insights into the mechanisms of novel gene regulations during a biological or clinical progression. The Java source code, program tutorial and other related materials used in this program are available at http://websystem.csie.ncku.edu.tw/CTGR-Span.rar.
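To make the notion of a cross-timepoint sequential pattern concrete, the sketch below counts the support of one candidate pattern in a converted transactional database. It shows only the generic containment test used in sequential pattern mining, not CTGR-Span itself; the data layout (one list of gene sets per sample, ordered by time point) and the names `contains_pattern` and `support` are assumptions for illustration.

```python
def contains_pattern(sequence, pattern):
    """True if the itemsets of `pattern` occur, in order, at strictly
    increasing time points of `sequence` (each as a subset of that point)."""
    t = 0
    for itemset in pattern:
        # advance to the earliest remaining time point that contains itemset
        while t < len(sequence) and not itemset <= sequence[t]:
            t += 1
        if t == len(sequence):
            return False
        t += 1  # the next itemset must match a later time point
    return True


def support(sequences, pattern):
    """Fraction of samples whose time course contains the pattern."""
    return sum(contains_pattern(s, pattern) for s in sequences) / len(sequences)


# Hypothetical converted database: one time-ordered list of gene sets per sample.
samples = [
    [{"A", "B"}, {"C"}, {"D"}],
    [{"A"}, {"B", "C"}, {"D"}],
    [{"C"}, {"A"}, {"B"}],
    [{"D"}, {"C"}],
]
```

Here `support(samples, [{"A"}, {"C"}])` is 0.5, because gene A precedes gene C in the first two samples only. When each time point holds hundreds of genes, the number of candidate patterns grows very quickly, which is the scaling problem the abstract describes.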


Subjects
Algorithms , Data Mining , Gene Expression Regulation , Cluster Analysis , Humans , Oligonucleotide Array Sequence Analysis/methods
7.
BMC Bioinformatics ; 14: 230, 2013 Jul 21.
Article in English | MEDLINE | ID: mdl-23870110

ABSTRACT

BACKGROUND: Frequent pattern mining applied to microarray datasets appears to be a promising strategy for identifying relationships between gene expression levels. Unfortunately, this analysis identifies too many itemsets (co-expressed genes), because it considers neither the importance of each gene within the biological processes underlying a cellular response nor the temporal properties of treatment-control matched conditions in a microarray dataset. RESULTS: We propose a method termed TIIM (Top-k Impactful Itemsets Miner), which requires only a user-defined number k to explore the top k itemsets with the most significantly differentially co-expressed genes between 2 conditions in a time course. To give genes different weights, a table of impact degrees was constructed for each gene based on the number of its neighboring genes in gene regulatory networks that are differentially expressed in the dataset. Finally, the resulting top-k impactful itemsets were manually evaluated against previous literature and analyzed by a Gene Ontology enrichment method. CONCLUSIONS: In this study, the proposed method was evaluated on 2 publicly available time-course microarray datasets with 2 different experimental conditions. In both datasets, TIIM identified potential itemsets of co-expressed genes that were supported by the literature, and it showed higher accuracy than the 2 corresponding control methods: i) performing TIIM without considering the gene expression differentiation between the 2 experimental conditions or the impact degrees, and ii) performing TIIM with a constant impact degree for each gene. The proposed method found several new gene regulations within these itemsets that may be useful to biologists and provide further insight into the mechanisms underpinning biological processes. 
The Java source code and other related materials used in this study are available at "http://websystem.csie.ncku.edu.tw/TIIM_Program.rar".
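The combination of a user-chosen k, a treatment-versus-control contrast and per-gene impact degrees can be sketched as follows. The scoring formula (absolute support difference multiplied by the mean impact degree), the restriction to fixed-size itemsets and the name `top_k_itemsets` are assumptions made for illustration; the abstract does not specify how TIIM scores itemsets.

```python
import heapq
from itertools import combinations


def support(transactions, itemset):
    """Fraction of transactions that contain every gene in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def top_k_itemsets(treated, control, impact, k, size=2):
    """Rank gene itemsets by a hypothetical impact-weighted contrast score.

    score = |support in treated - support in control| * mean impact degree
    Genes missing from `impact` default to weight 1.
    """
    genes = sorted(set().union(*treated, *control))
    scored = []
    for combo in combinations(genes, size):
        itemset = frozenset(combo)
        diff = abs(support(treated, itemset) - support(control, itemset))
        weight = sum(impact.get(g, 1) for g in combo) / size
        scored.append((diff * weight, combo))
    return heapq.nlargest(k, scored)  # only the k best are reported


# Hypothetical toy data: sets of genes flagged as expressed per observation.
treated = [{"A", "B"}, {"A", "B"}, {"A", "C"}, {"B", "C"}]
control = [{"A", "C"}, {"A", "C"}, {"B"}, {"C"}]
impact = {"A": 2, "B": 2, "C": 1}
```

Asking for a fixed k removes the need to guess a minimum-support threshold, which is the usability point the abstract makes. Setting every impact degree to the same constant reduces the score to the plain support contrast, similar in spirit to the second control method mentioned above.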


Subjects
Gene Expression Profiling , Gene Expression Regulation , Gene Regulatory Networks , Algorithms , Factual Databases , Gene Expression Profiling/methods , Genes , Oligonucleotide Array Sequence Analysis/methods
8.
PLoS One ; 7(2): e32553, 2012.
Article in English | MEDLINE | ID: mdl-22384271

ABSTRACT

Hepatitis B virus (HBV) is one of the most common DNA viruses that can cause aggressive hepatitis, cirrhosis and hepatocellular carcinoma. Although many people are persistently infected with HBV, the kinetics in serum levels of viral loads and the host immune responses vary from person to person. HBV precore/core open reading frame (ORF) encoding proteins, hepatitis B e antigen (HBeAg) and core antigen (HBcAg), are two indicators of active viral replication. The aim of this study was to discover a variety of amino acid covariances in responses to viral kinetics, seroconversion and genotypes during the course of HBV infection. A one year follow-up study was conducted with a total number of 1,694 clones from 23 HBeAg-positive chronic hepatitis B patients. Serum alanine aminotransferase, HBV DNA and HBeAg levels were measured monthly as criteria for clustering patients into several different subgroups. Monthly derived multiple precore/core ORFs were directly sequenced and translated into amino acid sequences. For each subgroup, time-dependent covariances were identified from their time-varying sequences over the entire follow-up period. The fluctuating, wavering, HBeAg-nonseroconversion and genotype C subgroups showed greater degrees of covariances than the stationary, declining, HBeAg-seroconversion and genotype B. Referring to literature, mutation hotspots within our identified covariances were associated with the infection process. Remarkably, hotspots were predominant in genotype C. Moreover, covariances were also identified at early stage (spanning from baseline to a peak of serum HBV DNA) in order to determine the intersections with aforementioned time-dependent covariances. Preserved covariances, namely representative covariances, of each subgroup are visually presented using a tree-based structure. Our results suggested that identified covariances were strongly associated with viral kinetics, seroconversion and genotypes. 
Moreover, representative covariances may help clinicians prescribe suitable treatment for patients even when they have no obvious symptoms at the early stage of HBV infection.


Subjects
Hepatitis B Core Antigens/genetics , Hepatitis B e Antigens/chemistry , Hepatitis B Virus/chemistry , Hepatitis B/virology , Adult , Alanine Transaminase/blood , Algorithms , Viral DNA/genetics , Female , Genotype , Hepatitis B Core Antigens/chemistry , Humans , Kinetics , Male , Middle Aged , Statistical Models , Open Reading Frames , Reproducibility of Results , Risk
9.
Neurobiol Aging ; 33(2): 422.e11-25, 2012 Feb.
Article in English | MEDLINE | ID: mdl-21112127

ABSTRACT

The CCAAT/enhancer binding protein delta (CEBPD, C/EBPδ, NF-IL6β) is induced in many inflammation-related diseases, suggesting that CEBPD and its downstream targets may play central roles in these conditions. Neuropathological studies show that a neuroinflammatory response parallels the early stages of Alzheimer's disease (AD). However, the precise mechanistic correlation between inflammation and AD pathogenesis remains unclear. CEBPD is upregulated in the astrocytes of AD patients. Therefore, we asked if activation of astrocytic CEBPD could contribute to AD pathogenesis. In this report, a novel role of CEBPD in attenuating macrophage-mediated phagocytosis of damaged neuron cells was found. By global gene expression profiling, we identified the inflammatory marker pentraxin-3 (PTX3, TNFAIP5, TSG-14) as a CEBPD target in astrocytes. Furthermore, we demonstrate that PTX3 participates in the attenuation of macrophage-mediated phagocytosis of damaged neuron cells. This study provides the first demonstration of a role for astrocytic CEBPD and the CEBPD-regulated molecule PTX3 in the accumulation of damaged neurons, which is a hallmark of AD pathogenesis.


Subjects
Astrocytes/metabolism , C-Reactive Protein/metabolism , CCAAT-Enhancer-Binding Protein-delta/metabolism , Macrophages/physiology , Neurons/cytology , Neurons/metabolism , Phagocytosis/physiology , Serum Amyloid P-Component/metabolism , Apoptosis/physiology , Cell Communication/physiology , Cell Line , Humans , Up-Regulation/physiology
10.
Bioinformatics ; 27(22): 3142-8, 2011 Nov 15.
Article in English | MEDLINE | ID: mdl-21926125

ABSTRACT

MOTIVATION: Association rule analysis is an important technique applied to gene expression data for finding expression relationships between genes. However, previous methods implicitly assume that all genes have similar importance, or they ignore the individual importance of each gene, and the relation intensity between two items has not been taken into consideration. We therefore propose the REMMAR (RElational-based Multiple Minimum supports Association Rules) algorithm to tackle this problem. This method adjusts the minimum relation support (MRS) for each gene pair according to the regulatory relation intensity, in order to discover more important association rules with stronger biological meaning. RESULTS: In the case study, REMMAR used the shortest distance between any two genes in the Saccharomyces cerevisiae gene regulatory network (GRN) as the relation intensity to discover association rules from two S. cerevisiae gene expression datasets. In experimental evaluation, REMMAR generated more rules with stronger relation intensity and filtered out rules without biological meaning in the protein-protein interaction network (PPIN). Furthermore, according to a literature survey of the discovered rules, the proposed method had a higher precision (100%) than the reference Apriori method (87.5%). The proposed REMMAR algorithm can therefore discover association rules with stronger biological relationships that are missed by traditional methods, assisting biologists in complicated genetic exploration.
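The idea of a pair-specific minimum support driven by network distance can be sketched as below. This is an illustration under stated assumptions, not the published REMMAR procedure: the threshold rule in `min_relation_support` (closer pairs get a lower threshold, unreachable pairs the highest), its `base`, `step` and `cap` parameters, and the toy network are all hypothetical.

```python
from collections import deque


def shortest_distance(network, source, target):
    """Breadth-first hop count between two genes; None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in network.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def min_relation_support(distance, base=0.2, step=0.2, cap=1.0):
    """Hypothetical rule: each extra hop raises the required support."""
    if distance is None:
        return cap
    return min(cap, base + step * (distance - 1))


def frequent_pairs(transactions, network, pairs):
    """Keep gene pairs whose support meets their own distance-based threshold."""
    kept = []
    for a, b in pairs:
        sup = sum(a in t and b in t for t in transactions) / len(transactions)
        if sup >= min_relation_support(shortest_distance(network, a, b)):
            kept.append((a, b))
    return kept


# Hypothetical toy regulatory network (adjacency lists) and expression data.
network = {"G1": ["G2"], "G2": ["G1", "G3"], "G3": ["G2"], "G4": []}
transactions = [
    {"G1", "G2", "G4"}, {"G1", "G2", "G3"}, {"G1", "G4"}, {"G2", "G3"}, {"G3"},
]
```

In this toy data the pairs (G1, G2) and (G1, G4) have the same support of 0.4, but only (G1, G2) is kept, because G4 is not connected to G1 in the network and therefore faces the highest threshold. A single global minimum support, as in plain Apriori, could not separate the two.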


Subjects
Algorithms , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Gene Expression Regulation , Gene Regulatory Networks , Protein Interaction Maps , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism
11.
Cancer Res ; 70(1): 192-201, 2010 Jan 01.
Article in English | MEDLINE | ID: mdl-19996286

ABSTRACT

The BRCA1-interacted transcriptional repressor ZBRK1 has been associated with antiangiogenesis, but direct evidence of a tumor suppressor role has been lacking. In this study, we provide evidence of such a role in cervical carcinoma. ZBRK1 levels in cervical tumor cells were significantly lower than in normal cervical epithelial cells. In HeLa cervical cancer cells, enforced expression inhibited malignant growth, invasion, and metastasis in a variety of in vitro and in vivo assays. Expression of the metalloproteinase MMP9, which is known to be an important driver of invasion and metastasis, was found to be inversely correlated with ZBRK1 in tumor tissues and a target for repression in tumor cells. Our findings suggest that ZBRK1 acts to inhibit metastasis of cervical carcinoma, perhaps by modulating MMP9 expression.


Subjects
Neoplastic Gene Expression Regulation/genetics , Matrix Metalloproteinase 9/genetics , Neoplasm Invasiveness/genetics , Repressor Proteins/genetics , Uterine Cervical Neoplasms/genetics , Animals , Tumor Cell Line , Cell Movement/genetics , Cell Proliferation , Female , Tumor Suppressor Genes , Humans , Immunoprecipitation , Matrix Metalloproteinase 9/biosynthesis , Mice , Nude Mice , Repressor Proteins/biosynthesis , Reverse Transcriptase Polymerase Chain Reaction