Results 1 - 20 of 119
1.
Cell ; 165(2): 357-71, 2016 Apr 07.
Article in English | MEDLINE | ID: mdl-27058666

ABSTRACT

We report a mechanism through which the transcription machinery directly controls topoisomerase 1 (TOP1) activity to adjust DNA topology throughout the transcription cycle. By comparing TOP1 occupancy using chromatin immunoprecipitation sequencing (ChIP-seq) versus TOP1 activity using topoisomerase 1 sequencing (TOP1-seq), a method reported here to map catalytically engaged TOP1, TOP1 bound at promoters was discovered to become fully active only after pause-release. This transition coupled the phosphorylation of the carboxyl-terminal-domain (CTD) of RNA polymerase II (RNAPII) with stimulation of TOP1 above its basal rate, enhancing its processivity. TOP1 stimulation is strongly dependent on the kinase activity of BRD4, a protein that phosphorylates Ser2-CTD and regulates RNAPII pause-release. Thus the coordinated action of BRD4 and TOP1 overcame the torsional stress opposing transcription as RNAPII commenced elongation but preserved negative supercoiling that assists promoter melting at start sites. This nexus between transcription and DNA topology promises to elicit new strategies to intercept pathological gene expression.


Subjects
DNA Topoisomerases, Type I/metabolism , DNA/metabolism , RNA Polymerase II/metabolism , Transcription, Genetic , DNA/chemistry , DNA Topoisomerases, Type I/genetics , Gene Knockdown Techniques , Humans , Promoter Regions, Genetic , RNA Polymerase II/chemistry , RNA Polymerase II/isolation & purification , Transcription Elongation, Genetic , Transcription Factors/isolation & purification , Transcription Initiation Site
2.
Cell ; 153(5): 988-99, 2013 May 23.
Article in English | MEDLINE | ID: mdl-23706737

ABSTRACT

Lymphocyte activation is initiated by a global increase in messenger RNA synthesis. However, the mechanisms driving transcriptome amplification during the immune response are unknown. By monitoring single-stranded DNA genome wide, we show that the genome of naive cells is poised for rapid activation. In G0, ∼90% of promoters from genes to be expressed in cycling lymphocytes are polymerase loaded but unmelted and support only basal transcription. Furthermore, the transition from abortive to productive elongation is kinetically limiting, causing polymerases to accumulate nearer to transcription start sites. Resting lymphocytes also limit the expression of the transcription factor IIH complex, including XPB and XPD helicases involved in promoter melting and open complex extension. To date, two rate-limiting steps have been shown to control global gene expression in eukaryotes: preinitiation complex assembly and polymerase pausing. Our studies identify promoter melting as a third key regulatory step and propose that this mechanism ensures a prompt lymphocyte response to invading pathogens.


Subjects
B-Lymphocytes/metabolism , Gene Expression Regulation , Lymphocyte Activation , Lymphocytes/metabolism , Promoter Regions, Genetic , Animals , B-Lymphocytes/immunology , Cell Line, Tumor , DNA, Single-Stranded/metabolism , Enhancer Elements, Genetic , Genome-Wide Association Study , Humans , Lymphocytes/cytology , Lymphocytes/immunology , Mice , Transcription Factor TFIIH/metabolism , Transcription, Genetic
3.
PLoS Comput Biol ; 19(9): e1011472, 2023 09.
Article in English | MEDLINE | ID: mdl-37721939

ABSTRACT

There is a growing awareness that tumor-adjacent normal tissues used as control samples in cancer studies do not represent fully healthy tissues. Instead, they are intermediates between healthy tissues and tumors. The factors that contribute to the deviation of such control samples from a healthy state include exposure to tumor-promoting factors, tumor-related immune response, and other aspects of the tumor microenvironment. Characterizing the relation between gene expression of tumor-adjacent control samples and tumors is fundamental for understanding the roles of the microenvironment in tumor initiation and progression, as well as for identification of diagnostic and prognostic biomarkers for cancers. To address this demand, we developed and validated TranNet, a computational approach that utilizes gene expression in matched control and tumor samples to study the relation between their gene expression profiles. TranNet infers a sparse weighted bipartite graph from gene expression profiles of matched control samples to tumors. The results allow us to identify predictors (potential regulators) of this transition. To our knowledge, TranNet is the first computational method to infer such dependencies. We applied TranNet to the data of several cancer types and their matched control samples from The Cancer Genome Atlas (TCGA). Many predictors identified by TranNet are genes associated with regulation by the tumor microenvironment as they are enriched in G-protein coupled receptor signaling, cell-to-cell communication, immune processes, and cell adhesion. Correspondingly, targets of inferred predictors are enriched in pathways related to tissue remodelling (including the epithelial-mesenchymal transition (EMT)), immune response, and cell proliferation. This implies that the predictors are markers and potential stromal facilitators of tumor progression. Our results provide new insights into the relationships between tumor-adjacent control samples, tumors, and the tumor environment. Moreover, the set of predictors identified by TranNet will provide a valuable resource for future investigations.


Subjects
Neoplasms , Humans , Neoplasms/metabolism , Transcriptome , Cell Communication , Cell Transformation, Neoplastic , Tumor Microenvironment , Biomarkers, Tumor/genetics
4.
Bioinformatics ; 36(8): 2572-2574, 2020 04 15.
Article in English | MEDLINE | ID: mdl-31882996

ABSTRACT

SUMMARY: Large-scale data analysis in bioinformatics requires pipelined execution of multiple software. Generally each stage in a pipeline takes considerable computing resources and several workflow management systems (WMS), e.g. Snakemake, Nextflow, Common Workflow Language, Galaxy, etc. have been developed to ensure optimum execution of the stages across two invocations of the pipeline. However, when the pipeline needs to be executed with different settings of parameters, e.g. thresholds, underlying algorithms, etc. these WMS require significant scripting to ensure an optimal execution. We developed JUDI on top of DoIt, a Python based WMS, to systematically handle parameter settings based on the principles of database management systems. Using a novel modular approach that encapsulates a parameter database in each task and file associated with a pipeline stage, JUDI simplifies plug-and-play of the pipeline stages. For a typical pipeline with n parameters, JUDI reduces the number of lines of scripting required by a factor of O(n). With properly designed parameter databases, JUDI not only enables reproducing research under published values of parameters but also facilitates exploring newer results under novel parameter settings. AVAILABILITY AND IMPLEMENTATION: https://github.com/ncbi/JUDI. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
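
As a rough illustration of the parameter-database idea described in this abstract, the sketch below enumerates every combination of parameter settings once and derives one output path per combination, so a pipeline stage does not have to be scripted separately for each setting. It is not JUDI's actual API: the function names `parameter_table` and `output_path`, the parameter names, and the file layout are all hypothetical.

```python
from itertools import product


def parameter_table(**params):
    """Return one dict per combination of the given parameter values.

    Hypothetical helper: each keyword maps a parameter name to the list
    of values it may take; rows are the Cartesian product of those lists.
    """
    names = list(params)
    return [dict(zip(names, combo)) for combo in product(*params.values())]


def output_path(stage, row):
    """Build a deterministic file path for one stage and one parameter row."""
    tag = "_".join(f"{name}-{row[name]}" for name in sorted(row))
    return f"{stage}/{tag}.txt"


# Assumed toy settings: two aligners crossed with two thresholds.
table = parameter_table(aligner=["bwa", "bowtie2"], threshold=[0.01, 0.05])
paths = [output_path("align", row) for row in table]

for path in paths:
    print(path)
```

Adding a third parameter only means adding one more keyword argument; the table and the derived paths grow automatically, which is the kind of saving in scripting effort the abstract refers to.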


Subjects
Computational Biology , Software , Algorithms , Language , Workflow
5.
Nature ; 528(7580): 142-6, 2015 Dec 03.
Article in English | MEDLINE | ID: mdl-26605532

ABSTRACT

DNase I hypersensitive sites (DHSs) provide important information on the presence of transcriptional regulatory elements and the state of chromatin in mammalian cells. Conventional DNase sequencing (DNase-seq) for genome-wide DHSs profiling is limited by the requirement of millions of cells. Here we report an ultrasensitive strategy, called single-cell DNase sequencing (scDNase-seq) for detection of genome-wide DHSs in single cells. We show that DHS patterns at the single-cell level are highly reproducible among individual cells. Among different single cells, highly expressed gene promoters and enhancers associated with multiple active histone modifications display constitutive DHS whereas chromatin regions with fewer histone modifications exhibit high variation of DHS. Furthermore, the single-cell DHSs predict enhancers that regulate cell-specific gene expression programs and the cell-to-cell variations of DHS are predictive of gene expression. Finally, we apply scDNase-seq to pools of tumour cells and pools of normal cells, dissected from formalin-fixed paraffin-embedded tissue slides from patients with thyroid cancer, and detect thousands of tumour-specific DHSs. Many of these DHSs are associated with promoters and enhancers critically involved in cancer development. Analysis of the DHS sequences uncovers one mutation (chr18: 52417839G>C) in the tumour cells of a patient with follicular thyroid carcinoma, which affects the binding of the tumour suppressor protein p53 and correlates with decreased expression of its target gene TXNL1. In conclusion, scDNase-seq can reliably detect DHSs in single cells, greatly extending the range of applications of DHS analysis both for basic and for translational research, and may provide critical information for personalized medicine.


Subjects
Chromatin/genetics , Chromatin/metabolism , Deoxyribonuclease I/metabolism , Formaldehyde , Genome/genetics , Paraffin Embedding , Single-Cell Analysis/methods , Tissue Fixation , Adenocarcinoma, Follicular/genetics , Adenocarcinoma, Follicular/pathology , Animals , Enhancer Elements, Genetic/genetics , Gene Expression Profiling , Histones/metabolism , Humans , Mice , Mutation/genetics , NIH 3T3 Cells , Promoter Regions, Genetic/genetics , Reproducibility of Results , Thioredoxins/genetics , Thyroid Neoplasms/genetics , Thyroid Neoplasms/pathology , Tumor Suppressor Protein p53/metabolism
6.
Nucleic Acids Res ; 47(13): 6632-6641, 2019 07 26.
Article in English | MEDLINE | ID: mdl-31226207

ABSTRACT

Understanding the principles of DNA binding by transcription factors (TFs) is of primary importance for studying gene regulation. Recently, several lines of evidence suggested that both DNA sequence and shape contribute to TF binding. However, the following compelling question is yet to be considered: in the absence of any sequence similarity to the binding motif, can DNA shape still increase binding probability? To address this challenge, we developed Co-SELECT, a computational approach to analyze the results of in vitro HT-SELEX experiments for TF-DNA binding. Specifically, Co-SELECT leverages the presence of motif-free sequences in late HT-SELEX rounds; their enrichment in weak binders allows Co-SELECT to detect evidence for the role of DNA shape features in TF binding. Our approach revealed that, even in the absence of the sequence motif, TFs have a propensity to bind to DNA molecules of the shape consistent with motif-specific binding. This provides the first direct evidence that shape features that accompany the preferred sequence motifs also bestow an advantage for weak, sequence non-specific binding.


Subjects
Aptamers, Nucleotide/chemistry , Nucleic Acid Conformation , SELEX Aptamer Technique/methods , Transcription Factors/metabolism , Aptamers, Nucleotide/isolation & purification , Aptamers, Nucleotide/metabolism , Datasets as Topic , Protein Binding , Structure-Activity Relationship
7.
Nucleic Acids Res ; 46(16): 8133-8142, 2018 09 19.
Article in English | MEDLINE | ID: mdl-29986050

ABSTRACT

RNA-based therapeutics, i.e. the utilization of synthetic RNA molecules to alter cellular functions, have the potential to address targets which are currently out of scope for traditional drug design pipelines. This potential however hinges on the ability to selectively deliver and internalize therapeutic RNAs into cells of interest. Cell internalizing RNA aptamers selected against surface receptors and discriminatively expressed on target cells hold particular promise as suitable candidates for such delivery agents. Specifically, these aptamers can be combined with a therapeutic cargo and facilitate internalization of the cargo into the cell of interest. A recently proposed method to obtain such aptamer-cargo constructs employs a double-stranded "sticky bridge" where the complementary strands constituting the bridge are conjugated with the aptamer and the cargo respectively. The design of appropriate sticky bridge sequences however has proven highly challenging given the structural and functional constraints imposed on them during synthesis and administration. These include, but are not limited to, guaranteed formation and stability of the complex, non-interference with the aptamer or the cargo, as well as the prevention of spurious aggregation of the molecules during incubation. In order to address these issues, we have developed AptaBlocks - a computational method to design RNA complexes that hybridize via sticky bridges. The effectiveness of our approach has been verified computationally, and experimentally in the context of drug delivery to pancreatic cancer cells. Importantly, AptaBlocks is a general method for the assembly of nucleic acid systems that, in addition to designing of RNA-based drug delivery systems, can be used in other applications of RNA nanotechnology. AptaBlocks is available at https://github.com/wyjhxq/AptaBlocks.


Subjects
Algorithms , Aptamers, Nucleotide/metabolism , Computational Biology/methods , Drug Delivery Systems/methods , Pharmaceutical Preparations/administration & dosage , RNA/metabolism , Aptamers, Nucleotide/chemistry , Aptamers, Nucleotide/genetics , Cell Line, Tumor , Humans , Internet , Nanotechnology/methods , Neoplasms/genetics , Neoplasms/metabolism , Pharmaceutical Preparations/chemistry , RNA/chemistry , RNA/genetics , Reproducibility of Results
8.
Bioinformatics ; 34(2): 330-337, 2018 Jan 15.
Article in English | MEDLINE | ID: mdl-29028923

ABSTRACT

MOTIVATION: Cancers arise as the result of somatically acquired changes in the DNA of cancer cells. However, in addition to the mutations that confer a growth advantage, cancer genomes accumulate a large number of somatic mutations resulting from normal DNA damage and repair processes as well as carcinogenic exposures or cancer-related aberrations of DNA maintenance machinery. These mutagenic processes often produce characteristic mutational patterns called mutational signatures. The decomposition of a cancer genome's mutation catalog into mutations consistent with such signatures can provide valuable information about cancer etiology. However, the results from different decomposition methods are not always consistent. Hence, one needs to be able to not only decompose a patient's mutational profile into signatures but also establish the accuracy of such decomposition. RESULTS: We proposed two complementary ways of measuring confidence and stability of decomposition results and applied them to analyze mutational signatures in breast cancer genomes. We identified both very stable and highly unstable signatures, as well as signatures that previously have not been associated with breast cancer. We also provided additional support for the novel signatures. Our results emphasize the importance of assessing the confidence and stability of inferred signature contributions. AVAILABILITY AND IMPLEMENTATION: All tools developed in this paper have been implemented in an R package, called SignatureEstimation, which is available from https://www.ncbi.nlm.nih.gov/CBBresearch/Przytycka/index.cgi#signatureestimation. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

9.
BMC Med ; 16(1): 150, 2018 08 27.
Article in English | MEDLINE | ID: mdl-30145981

ABSTRACT

BACKGROUND: Personalized, precision, P4, or stratified medicine is understood as a medical approach in which patients are stratified based on their disease subtype, risk, prognosis, or treatment response using specialized diagnostic tests. The key idea is to base medical decisions on individual patient characteristics, including molecular and behavioral biomarkers, rather than on population averages. Personalized medicine is deeply connected to and dependent on data science, specifically machine learning (often named Artificial Intelligence in the mainstream media). While in recent years there has been a lot of enthusiasm about the potential of 'big data' and machine learning-based solutions, only a few examples exist that impact current clinical practice. The lack of impact on clinical practice can largely be attributed to insufficient performance of predictive models, difficulties in interpreting complex model predictions, and lack of validation via prospective clinical trials that demonstrate a clear benefit compared to the standard of care. In this paper, we review the potential of state-of-the-art data science approaches for personalized medicine, discuss open challenges, and highlight directions that may help to overcome them in the future. CONCLUSIONS: There is a need for an interdisciplinary effort, including data scientists, physicians, patient advocates, regulatory agencies, and health insurance organizations. Partially unrealistic expectations and concerns about data science-based solutions need to be better managed. In parallel, computational methods must advance further to provide direct benefit to clinical practice.


Subjects
Precision Medicine/methods , Humans , Prospective Studies
10.
Bioinformatics ; 33(6): 814-821, 2017 03 15.
Article in English | MEDLINE | ID: mdl-27153670

ABSTRACT

Motivation: Mutual exclusivity is a widely recognized property of many cancer drivers. Knowledge about these relationships can provide important insights into cancer drivers, cancer-driving pathways and cancer subtypes. It can also be used to predict new functional interactions between cancer driving genes and uncover novel cancer drivers. Currently, most mutual exclusivity analyses are performed focusing on a limited set of genes, in part due to the computational cost required to rigorously compute P-values. Results: To reduce the computing cost and perform less restricted mutual exclusivity analysis, we developed an efficient method to estimate P-values while controlling the mutation rates of individual patients and genes, similar to the permutation test. A comprehensive mutual exclusivity analysis allowed us to uncover mutually exclusive pairs, some of which may have relatively low mutation rates. These pairs often included likely cancer drivers that have been missed in previous analyses. More importantly, our results demonstrated that mutual exclusivity can also provide information that goes beyond the interactions between cancer drivers and can, for example, elucidate different mutagenic processes in different cancer groups. In particular, including frequently mutated, long genes such as TTN in our analysis allowed us to observe interesting patterns of APOBEC activity in breast cancer and identify a set of related driver genes that are highly predictive of patient survival. In addition, we utilized our mutual exclusivity analysis in support of a previously proposed model where APOBEC activity is the underlying process that causes TP53 mutations in a subset of breast cancer cases. Availability and Implementation: http://www.ncbi.nlm.nih.gov/CBBresearch/Przytycka/index.cgi#wesme. Contact: przytyck@ncbi.nlm.nih.gov. Supplementary information: Supplementary data are available at Bioinformatics online.
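
To make the notion of a mutual exclusivity P-value concrete, here is a minimal sketch using a plain hypergeometric left-tail test for a single gene pair. This is a simpler stand-in, not the method of the paper: unlike the approach described in the abstract, it treats all patients as equally likely to carry a mutation and so does not control for per-patient mutation rates. The function name and the toy sample indices are hypothetical.

```python
from math import comb


def mutual_exclusivity_pvalue(n_samples, mutated_a, mutated_b):
    """Probability of seeing at most the observed overlap by chance.

    mutated_a and mutated_b are collections of sample indices in which
    gene A and gene B are mutated. A small value suggests the two genes
    co-occur less often than expected, i.e. they look mutually exclusive.
    """
    a, b = set(mutated_a), set(mutated_b)
    observed = len(a & b)
    n_a, n_b = len(a), len(b)
    total = comb(n_samples, n_b)
    smallest_overlap = max(0, n_a + n_b - n_samples)
    p_value = 0.0
    for k in range(smallest_overlap, observed + 1):
        # Ways to place k of gene B's mutations inside gene A's samples
        # and the remaining n_b - k outside them.
        p_value += comb(n_a, k) * comb(n_samples - n_a, n_b - k) / total
    return p_value


# Toy cohort of 10 samples in which the two genes never co-occur.
p_exclusive = mutual_exclusivity_pvalue(10, [0, 1, 2, 3, 4], [5, 6, 7, 8, 9])
print(p_exclusive)
```

Evaluating every gene pair this way is cheap, but the uniform-patient assumption can be misleading when a few hypermutated samples dominate, which is the situation the permutation-style control in the abstract is meant to handle.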


Subjects
Breast Neoplasms/genetics , Computational Biology/methods , Mutation , Female , Genes, Neoplasm , Humans
11.
PLoS Comput Biol ; 13(10): e1005695, 2017 Oct.
Article in English | MEDLINE | ID: mdl-29023534

ABSTRACT

The analysis of the mutational landscape of cancer, including mutual exclusivity and co-occurrence of mutations, has been instrumental in studying the disease. We hypothesized that exploring the interplay between co-occurrence, mutual exclusivity, and functional interactions between genes will further improve our understanding of the disease and help to uncover new relations between cancer driving genes and pathways. To this end, we designed a general framework, BeWith, for identifying modules with different combinations of mutation and interaction patterns. We focused on three different settings of the BeWith schema: (i) BeME-WithFun, in which the relations between modules are enriched with mutual exclusivity, while genes within each module are functionally related; (ii) BeME-WithCo, which combines mutual exclusivity between modules with co-occurrence within modules; and (iii) BeCo-WithMEFun, which ensures co-occurrence between modules, while the within-module relations combine mutual exclusivity and functional interactions. We formulated the BeWith framework using Integer Linear Programming (ILP), enabling us to find optimally scoring sets of modules. Our results demonstrate the utility of BeWith in providing novel information about mutational patterns, driver genes, and pathways. In particular, BeME-WithFun helped identify functionally coherent modules that might be relevant for cancer progression. In addition to finding previously well-known drivers, the identified modules pointed to other novel findings such as the interaction between NCOR2 and NCOA3 in breast cancer. Additionally, an application of the BeME-WithCo setting revealed that gene groups differ with respect to their vulnerability to different mutagenic processes, and helped us to uncover pairs of genes with potentially synergistic effects, including a potential synergy between mutations in TP53 and the metastasis-related DCC gene. Overall, BeWith not only helped us uncover relations between potential driver genes and pathways, but also provided additional insights on patterns of the mutational landscape, going beyond cancer driving mutations. Implementation is available at https://www.ncbi.nlm.nih.gov/CBBresearch/Przytycka/software/bewith.html.


Subjects
Computational Biology/methods , Gene Regulatory Networks/genetics , Neoplasms/genetics , Neoplasms/metabolism , Algorithms , Humans , Intracellular Signaling Peptides and Proteins/analysis , Intracellular Signaling Peptides and Proteins/genetics , Intracellular Signaling Peptides and Proteins/metabolism , Mutation/genetics
12.
Genome Res ; 24(7): 1209-23, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24985915

ABSTRACT

Accurate gene model annotation of reference genomes is critical for making them useful. The modENCODE project has improved the D. melanogaster genome annotation by using deep and diverse high-throughput data. Since transcriptional activity that has been evolutionarily conserved is likely to have an advantageous function, we have performed large-scale interspecific comparisons to increase confidence in predicted annotations. To support comparative genomics, we filled in divergence gaps in the Drosophila phylogeny by generating draft genomes for eight new species. For comparative transcriptome analysis, we generated mRNA expression profiles on 81 samples from multiple tissues and developmental stages of 15 Drosophila species, and we performed cap analysis of gene expression in D. melanogaster and D. pseudoobscura. We also describe conservation of four distinct core promoter structures composed of combinations of elements at three positions. Overall, each type of genomic feature shows a characteristic divergence rate relative to neutral models, highlighting the value of multispecies alignment in annotating a target genome that should prove useful in the annotation of other high priority genomes, especially human and other mammalian genomes that are rich in noncoding sequences. We report that the vast majority of elements in the annotation are evolutionarily conserved, indicating that the annotation will be an important springboard for functional genetic testing by the Drosophila community.


Subjects
Computational Biology/methods , Drosophila melanogaster/genetics , Gene Expression Profiling , Molecular Sequence Annotation , Transcriptome , Animals , Cluster Analysis , Drosophila melanogaster/classification , Evolution, Molecular , Exons , Female , Genome, Insect , Humans , Male , Nucleotide Motifs , Phylogeny , Position-Specific Scoring Matrices , Promoter Regions, Genetic , RNA Editing , RNA Splice Sites , RNA Splicing , Reproducibility of Results , Transcription Initiation Site
13.
Bioinformatics ; 37(Suppl_1): i7-i8, 2021 07 12.
Article in English | MEDLINE | ID: mdl-34252970
14.
PLoS Comput Biol ; 12(4): e1004821, 2016 Apr.
Article in English | MEDLINE | ID: mdl-27078128

ABSTRACT

Recent genome-wide analyses have uncovered a high accumulation of RNA polymerase II (Pol II) at the 5' end of genes. This elevated Pol II presence at promoters, referred to here as Pol II poising, is mainly (but not exclusively) attributed to temporal pausing of transcription during early elongation which, in turn, has been proposed to be a regulatory step for processes that need to be activated "on demand". However, the full genome-wide regulatory role of Pol II poising is yet to be delineated. To elucidate the role of Pol II poising in B cell activation, we compared Pol II profiles in resting and activated B cells. We found that while Pol II poised genes generally overlap functionally among different B cell states and correspond to the functional groups previously identified for other cell types, non-poised genes are B cell state specific. Focusing on the changes in transcription activity upon B cell activation, we found that the majority of such changes were from poised to non-poised state. The genes showing this type of transition were functionally enriched in translation, RNA processing and mRNA metabolic process. Interestingly, we also observed a transition from non-poised to poised state. Within this set of genes we identified several Immediate Early Genes (IEG), which were highly expressed in resting B cells and shifted from non-poised to poised state after B cell activation. Thus Pol II poising not only marks genes for rapid expression in the future, but is also associated with genes that are silenced after a burst of their expression. Finally, we performed comparative analysis of the presence of G4 motifs in the context of poised versus non-poised but active genes. Interestingly, we observed a differential enrichment of these motifs upstream versus downstream of TSS depending on poising status. The enrichment of G4 sequence motifs upstream of TSS of non-poised active genes suggests a potential role of quadruplexes in expression regulation.


Subjects
B-Lymphocytes/enzymology , RNA Polymerase II/metabolism , Animals , B-Lymphocytes/cytology , B-Lymphocytes/immunology , Computational Biology , G-Quadruplexes , Gene Expression Regulation , Genes, Immediate-Early , Lymphocyte Activation , Mice , Mice, Inbred C57BL , Promoter Regions, Genetic , Transcription Elongation, Genetic , Transcription Initiation Site
15.
PLoS Comput Biol ; 12(3): e1004747, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26963104

ABSTRACT

Cancer is now increasingly studied from the perspective of dysregulated pathways, rather than as a disease resulting from mutations of individual genes. A pathway-centric view acknowledges the heterogeneity between genomic profiles from different cancer patients while assuming that the mutated genes are likely to belong to the same pathway and cause similar disease phenotypes. Indeed, network-centric approaches have proven to be helpful for finding genotypic causes of diseases, classifying disease subtypes, and identifying drug targets. In this review, we discuss how networks can be used to help understand patient-to-patient variations and how one can leverage this variability to elucidate interactions between cancer drivers.


Subjects
Models, Biological , Neoplasm Proteins/genetics , Neoplasm Proteins/metabolism , Neoplasms/physiopathology , Protein Interaction Mapping/methods , Signal Transduction , Animals , Computer Simulation , Genetic Predisposition to Disease/genetics , Genotype , Humans , Phenotype
16.
Nucleic Acids Res ; 43(12): 5699-707, 2015 Jul 13.
Article in English | MEDLINE | ID: mdl-25870409

ABSTRACT

High-Throughput (HT) SELEX combines SELEX (Systematic Evolution of Ligands by EXponential Enrichment), a method for aptamer discovery, with massively parallel sequencing technologies. This emerging technology provides data for a global analysis of the selection process and for simultaneous discovery of a large number of candidates but currently lacks dedicated computational approaches for their analysis. To close this gap, we developed novel in-silico methods to analyze HT-SELEX data and utilized them to study the emergence of polymerase errors during HT-SELEX. Rather than considering these errors as a nuisance, we demonstrated their utility for guiding aptamer discovery. Our approach builds on two main advancements in aptamer analysis: AptaMut, a novel technique allowing for the identification of polymerase errors conferring an improved binding affinity relative to the 'parent' sequence, and AptaCluster, an aptamer clustering algorithm which is, to the best of our knowledge, the only currently available tool capable of efficiently clustering entire aptamer pools. We applied these methods to an HT-SELEX experiment developing aptamers against Interleukin 10 receptor alpha chain (IL-10RA) and experimentally confirmed our predictions, thus validating our computational methods.


Subjects
High-Throughput Nucleotide Sequencing , Mutation , SELEX Aptamer Technique/methods , Software , Algorithms , Aptamers, Nucleotide/metabolism , Computer Simulation , Interleukin-10 Receptor alpha Subunit/metabolism , Models, Statistical , Mutagenesis
17.
Nucleic Acids Res ; 43(12): e82, 2015 Jul 13.
Article in English | MEDLINE | ID: mdl-26007661

ABSTRACT

Oligonucleotide aptamers represent a novel platform for creating ligands with desired specificity, and they offer many potentially significant advantages over monoclonal antibodies in terms of feasibility, cost, and clinical applicability. However, the isolation of high-affinity aptamer ligands from random oligonucleotide pools has been challenging. Although high-throughput sequencing (HTS) promises to significantly facilitate systematic evolution of ligands by exponential enrichment (SELEX) analysis, the enormous datasets generated in the process pose new challenges for identifying those rare, high-affinity aptamers present in a given pool. We show that emulsion PCR preserves library diversity, preventing the loss of rare high-affinity aptamers that are difficult to amplify. We also demonstrate the importance of using reference targets to eliminate binding candidates with reduced specificity. Using a combination of bioinformatics and functional analyses, we show that the rate of amplification is more predictive than prevalence with respect to binding affinity and that the mutational landscape within a cluster of related aptamers can guide the identification of high-affinity aptamer ligands. Finally, we demonstrate the power of this selection process for identifying cross-species aptamers that can bind human receptors and cross-react with their murine orthologs.


Subjects
Aptamers, Nucleotide/metabolism , High-Throughput Nucleotide Sequencing , SELEX Aptamer Technique/methods , Animals , Gene Library , Humans , Ligands , Mice , Mutation , Polymerase Chain Reaction , Receptors, Interleukin-10/metabolism , Tumor Necrosis Factor Receptor Superfamily, Member 9/metabolism
18.
Bioinformatics ; 31(12): i284-92, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-26072494

ABSTRACT

MOTIVATION: The data gathered by the Pan-Cancer initiative has created an unprecedented opportunity for illuminating common features across different cancer types. However, separating tissue-specific features from across cancer signatures has proven to be challenging. One of the often-observed properties of the mutational landscape of cancer is the mutual exclusivity of cancer driving mutations. Even though studies based on individual cancer types suggested that mutually exclusive pairs often share the same functional pathway, the relationship between across cancer mutual exclusivity and functional connectivity has not been previously investigated. RESULTS: We introduce a classification of mutual exclusivity into three basic classes: within tissue type exclusivity, across tissue type exclusivity and between tissue type exclusivity. We then combined across-cancer mutual exclusivity with interactions data to uncover pan-cancer dysregulated pathways. Our new method, Mutual Exclusivity Module Cover (MEMCover) not only identified previously known Pan-Cancer dysregulated subnetworks but also novel subnetworks whose across cancer role has not been appreciated well before. In addition, we demonstrate the existence of mutual exclusivity hubs, putatively corresponding to cancer drivers with strong growth advantages. Finally, we show that while mutually exclusive pairs within or across cancer types are predominantly functionally interacting, the pairs in between cancer mutual exclusivity class are more often disconnected in functional networks.


Subjects
Algorithms , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Mutation , Neoplasms/genetics , Humans , Neoplasms/classification
19.
Nucleic Acids Res ; 42(20): 12367-79, 2014 Nov 10.
Article in English | MEDLINE | ID: mdl-25336616

ABSTRACT

While individual non-B DNA structures have been shown to impact gene expression, their broad regulatory role remains elusive. We utilized genomic variants and expression quantitative trait loci (eQTL) data to analyze genome-wide variation propensities of potential non-B DNA regions and their relation to gene expression. Independent of genomic location, these regions were enriched in nucleotide variants. Our results are consistent with previously observed mutagenic properties of these regions and counter a previous study concluding that G-quadruplex regions have a reduced frequency of variants. While such mutagenicity might undermine functionality of these elements, we identified in potential non-B DNA regions a signature of negative selection. Yet, we found a depletion of eQTL-associated variants in potential non-B DNA regions, opposite to what might be expected from their proposed regulatory role. However, we also observed that genes downstream of potential non-B DNA regions showed higher expression variation between individuals. This coupling between mutagenicity and tolerance for expression variability of downstream genes may be a result of evolutionary adaptation, which allows reconciling mutagenicity of non-B DNA structures with their location in functionally important regions and their potential regulatory role.


Subjects
DNA/chemistry , Gene Expression , Genetic Variation , Genome, Human , Mutation Rate , Humans , Mutagenesis , Nucleic Acid Conformation , Nucleotides/analysis , Quantitative Trait Loci
20.
BMC Genomics ; 16 Suppl 12: S1, 2015.
Article in English | MEDLINE | ID: mdl-26679564

ABSTRACT

BACKGROUND: Meiotic recombination hotspots play important roles in various aspects of genomics, but the underlying mechanisms for regulating the locations and strengths of recombination hotspots are not yet fully revealed. Most existing algorithms for estimating recombination rates from sequence polymorphism data can only output average recombination rates of a population, although there is evidence for the heterogeneity in recombination rates among individuals. For genome-wide association studies (GWAS) of recombination hotspots, an efficient algorithm that estimates the individualized strengths of recombination hotspots is highly desirable. RESULTS: In this work, we propose a novel graph mining algorithm named ARG-walker, based on random walks on ancestral recombination graphs (ARG), to estimate individual-specific recombination hotspot strengths. Extensive simulations demonstrate that ARG-walker is able to distinguish the hot allele of a recombination hotspot from the cold allele. Integrated with output of ARG-walker, we performed GWAS on the phased haplotype data of the 22 autosome chromosomes of the HapMap Asian population samples of Chinese and Japanese (JPT+CHB). Significant cis-regulatory signals have been detected, which is corroborated by the enrichment of the well-known 13-mer motif CCNCCNTNNCCNC of PRDM9 protein. Moreover, two new DNA motifs have been identified in the flanking regions of the significantly associated SNPs (single nucleotide polymorphisms), which are likely to be new cis-regulatory elements of meiotic recombination hotspots of the human genome. CONCLUSIONS: Our results on both simulated and real data suggest that ARG-walker is a promising new method for estimating the individual recombination variations. In the future, it could be used to uncover the mechanisms of recombination regulation and human diseases related with recombination hotspots.


Subjects
Asian People/genetics , Genome-Wide Association Study/methods , Homologous Recombination , Metagenomics/methods , Regulatory Sequences, Nucleic Acid , Algorithms , China , Computer Simulation , Genetic Variation , Genome, Human , Histone-Lysine N-Methyltransferase/genetics , Humans , Japan , Meiosis , Polymorphism, Single Nucleotide