Results 1 - 16 of 16
1.
Nat Commun ; 13(1): 4534, 2022 08 04.
Article in English | MEDLINE | ID: mdl-35927228

ABSTRACT

Assessing tumour gene fitness in physiologically-relevant model systems is challenging due to biological features of in vivo tumour regeneration, including extreme variations in single cell lineage progeny. Here we develop a reproducible, quantitative approach to pooled genetic perturbation in patient-derived xenografts (PDXs), by encoding single cell output from transplanted CRISPR-transduced cells in combination with a Bayesian hierarchical model. We apply this to 181 PDX transplants from 21 breast cancer patients. We show that uncertainty in fitness estimates depends critically on the number of transplant cell clones and the variability in clone sizes. We use a pathway-directed allelic series to characterize Notch signaling, and quantify TP53 / MDM2 drug-gene conditional fitness in outlier patients. We show that fitness outlier identification can be mirrored by pharmacological perturbation. Overall, we demonstrate that the gene fitness landscape in breast PDXs is dominated by inter-patient differences.


Subjects
Breast Neoplasms , Clustered Regularly Interspaced Short Palindromic Repeats , Animals , Bayes Theorem , Breast Neoplasms/genetics , Disease Models, Animal , Female , Heterografts , Humans , Xenograft Model Antitumor Assays
2.
Nature ; 595(7868): 585-590, 2021 07.
Article in English | MEDLINE | ID: mdl-34163070

ABSTRACT

Progress in defining genomic fitness landscapes in cancer, especially those defined by copy number alterations (CNAs), has been impeded by lack of time-series single-cell sampling of polyclonal populations and temporal statistical models (refs. 1-7). Here we generated 42,000 genomes from multi-year time-series single-cell whole-genome sequencing of breast epithelium and primary triple-negative breast cancer (TNBC) patient-derived xenografts (PDXs), revealing the nature of CNA-defined clonal fitness dynamics induced by TP53 mutation and cisplatin chemotherapy. Using a new Wright-Fisher population genetics model (refs. 8, 9) to infer clonal fitness, we found that TP53 mutation alters the fitness landscape, reproducibly distributing fitness over a larger number of clones associated with distinct CNAs. Furthermore, in TNBC PDX models with mutated TP53, inferred fitness coefficients from CNA-based genotypes accurately forecast experimentally enforced clonal competition dynamics. Drug treatment in three long-term serially passaged TNBC PDXs resulted in cisplatin-resistant clones emerging from low-fitness phylogenetic lineages in the untreated setting. Conversely, high-fitness clones from treatment-naive controls were eradicated, signalling an inversion of the fitness landscape. Finally, upon release of drug, selection pressure dynamics were reversed, indicating a fitness cost of treatment resistance. Together, our findings define clonal fitness linked to both CNA and therapeutic resistance in polyclonal tumours.


Subjects
DNA Copy Number Variations , Drug Resistance, Neoplasm , Triple Negative Breast Neoplasms/genetics , Animals , Cell Line, Tumor , Cisplatin/pharmacology , Clone Cells/pathology , Female , Genetic Fitness , Humans , Mice , Models, Statistical , Neoplasm Transplantation , Tumor Suppressor Protein p53/genetics , Whole Genome Sequencing
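
Illustrative sketch for entry 2: the abstract names a Wright-Fisher population genetics model for inferring clonal fitness but gives no equations. The Python below is a generic textbook Wright-Fisher step with selection, not the authors' model or inference procedure. The function names, the selection form p_i * (1 + s_i), and all parameter values are assumptions made for illustration.

```python
import random


def expected_frequencies(freqs, s):
    """Deterministic selection step: clone i is reweighted by (1 + s_i).

    freqs: current clone frequencies (sum to 1).
    s: hypothetical selection coefficients, each greater than -1.
    """
    weights = [p * (1.0 + si) for p, si in zip(freqs, s)]
    total = sum(weights)
    return [w / total for w in weights]


def wright_fisher_step(freqs, s, pop_size, rng):
    """Selection followed by resampling pop_size cells (genetic drift)."""
    probs = expected_frequencies(freqs, s)
    draws = rng.choices(range(len(probs)), weights=probs, k=pop_size)
    counts = [0] * len(probs)
    for clone in draws:
        counts[clone] += 1
    return [c / pop_size for c in counts]


def simulate(freqs, s, pop_size, generations, seed=0):
    """Return the list of clone-frequency vectors, one per generation."""
    rng = random.Random(seed)
    trajectory = [list(freqs)]
    for _ in range(generations):
        freqs = wright_fisher_step(freqs, s, pop_size, rng)
        trajectory.append(freqs)
    return trajectory
```

In a simulation like this, a clone with a larger coefficient tends to expand over generations, while small populations let drift override selection. Inferring the coefficients from observed time-series frequencies, as the abstract describes, is a separate statistical problem that this sketch does not attempt.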
3.
PLoS Comput Biol ; 16(9): e1008270, 2020 09.
Article in English | MEDLINE | ID: mdl-32966276

ABSTRACT

We present Epiclomal, a probabilistic clustering method arising from a hierarchical mixture model to simultaneously cluster sparse single-cell DNA methylation data and impute missing values. Using synthetic and published single-cell CpG datasets, we show that Epiclomal outperforms non-probabilistic methods and can handle the inherent missing data characteristic that dominates single-cell CpG genome sequences. Using newly generated single-cell 5mCpG sequencing data, we show that Epiclomal discovers sub-clonal methylation patterns in aneuploid tumour genomes, thus defining epiclones that can match or transcend copy number-determined clonal lineages and opening up an important form of clonal analysis in cancer. Epiclomal is written in R and Python and is available at https://github.com/shahcompbio/Epiclomal.


Subjects
DNA Methylation , Single-Cell Analysis , Cluster Analysis , CpG Islands , Humans , Probability , Sequence Analysis, DNA/methods
4.
Methods Mol Biol ; 1097: 45-70, 2014.
Article in English | MEDLINE | ID: mdl-24639154

ABSTRACT

The stability of RNA secondary structure can be predicted using a set of nearest neighbor parameters. These parameters are widely used by algorithms that predict secondary structure. This contribution introduces the UV optical melting experiments that are used to determine the folding stability of short RNA strands. It explains how the nearest neighbor parameters are chosen and how the values are fit to the data. A sample nearest neighbor calculation is provided. The contribution concludes with new methods that use the database of sequences with known structures to determine parameter values.


Subjects
RNA Folding , RNA/chemistry , Algorithms , Computational Biology/methods , Nucleic Acid Conformation , Thermodynamics
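
Illustrative sketch for entry 4: the abstract mentions a sample nearest neighbor calculation without reproducing it. The Python below shows the general shape of such a calculation for a perfectly complementary RNA duplex: an initiation term, one stacking term per adjacent base-pair step, a penalty for each terminal A-U pair, and a symmetry correction for self-complementary strands. Every number in TOY_PARAMS is a placeholder chosen for readability, not a measured parameter, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the Watson-Crick partner strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def stack_key(dinuc):
    """A stack read from either strand is the same stack; use one key for both."""
    return min(dinuc, reverse_complement(dinuc))


# TOY values in kcal/mol, for illustration only (NOT experimental parameters).
TOY_PARAMS = {
    "initiation": 4.0,
    "terminal_AU": 0.5,
    "symmetry": 0.4,
    "stacks": {
        "AA": -1.0, "AU": -1.0, "UA": -1.0,
        "AC": -2.0, "AG": -2.0, "CA": -2.0, "GA": -2.0,
        "CC": -3.0, "CG": -3.0, "GC": -3.0,
    },
}


def duplex_free_energy(seq, params=TOY_PARAMS):
    """Sum the nearest neighbor terms for seq paired with its complement."""
    total = params["initiation"]
    # One stacking term for every pair of adjacent base pairs.
    for i in range(len(seq) - 1):
        total += params["stacks"][stack_key(seq[i:i + 2])]
    # Penalty for each helix end that closes with an A-U pair.
    for end in (seq[0], seq[-1]):
        if end in "AU":
            total += params["terminal_AU"]
    # Symmetry correction applies only to self-complementary strands.
    if seq == reverse_complement(seq):
        total += params["symmetry"]
    return total
```

With the toy table, duplex_free_energy("GCAU") adds 4.0 (initiation), -3.0, -2.0 and -1.0 (three stacks) and 0.5 (one terminal A-U end), giving -1.5. Replacing TOY_PARAMS with a published parameter set would be required for any real prediction.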
5.
Methods ; 58(3): 277-88, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22776363

ABSTRACT

Accumulating evidence demonstrates that the three-dimensional (3D) organization of chromosomes within the eukaryotic nucleus reflects and influences genomic activities, including transcription, DNA replication, recombination and DNA repair. In order to uncover structure-function relationships, it is necessary first to understand the principles underlying the folding and the 3D arrangement of chromosomes. Chromosome conformation capture (3C) provides a powerful tool for detecting interactions within and between chromosomes. A high throughput derivative of 3C, chromosome conformation capture on chip (4C), executes a genome-wide interrogation of interaction partners for a given locus. We recently developed a new method, a derivative of 3C and 4C, which, similar to Hi-C, is capable of comprehensively identifying long-range chromosome interactions throughout a genome in an unbiased fashion. Hence, our method can be applied to decipher the 3D architectures of genomes. Here, we provide a detailed protocol for this method.


Subjects
Chromosome Mapping/methods , Genome, Fungal , Saccharomyces cerevisiae/genetics , Animals , Biotinylation , Cross-Linking Reagents/chemistry , DNA Cleavage , DNA Restriction Enzymes/chemistry , DNA, Circular/chemistry , DNA, Circular/genetics , DNA, Circular/isolation & purification , DNA, Fungal/chemistry , DNA, Fungal/genetics , Formaldehyde/chemistry , Gene Library , Humans , Nucleic Acid Conformation , ROC Curve , Sequence Analysis, DNA
6.
RNA ; 16(12): 2304-18, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20940338

ABSTRACT

Methods for efficient and accurate prediction of RNA structure are increasingly valuable, given the current rapid advances in understanding the diverse functions of RNA molecules in the cell. To enhance the accuracy of secondary structure predictions, we developed and refined optimization techniques for the estimation of energy parameters. We build on two previous approaches to RNA free-energy parameter estimation: (1) the Constraint Generation (CG) method, which iteratively generates constraints that enforce known structures to have energies lower than other structures for the same molecule; and (2) the Boltzmann Likelihood (BL) method, which infers a set of RNA free-energy parameters that maximize the conditional likelihood of a set of reference RNA structures. Here, we extend these approaches in two main ways: We propose (1) a max-margin extension of CG, and (2) a novel linear Gaussian Bayesian network that models feature relationships, which effectively makes use of sparse data by sharing statistical strength between parameters. We obtain significant improvements in the accuracy of RNA minimum free-energy pseudoknot-free secondary structure prediction when measured on a comprehensive set of 2518 RNA molecules with reference structures. Our parameters can be used in conjunction with software that predicts RNA secondary structures, RNA hybridization, or ensembles of structures. Our data, software, results, and parameter sets in various formats are freely available at http://www.cs.ubc.ca/labs/beta/Projects/RNA-Params.


Subjects
Computational Biology/methods , Energy Metabolism/physiology , RNA/chemistry , RNA/metabolism , Statistics as Topic/methods , Algorithms , Animals , Base Composition , Base Sequence , Computational Biology/statistics & numerical data , Humans , Models, Theoretical , Molecular Sequence Data , Nucleic Acid Conformation , Reproducibility of Results , Sensitivity and Specificity , Sequence Analysis, RNA
7.
Nature ; 465(7296): 363-7, 2010 May 20.
Article in English | MEDLINE | ID: mdl-20436457

ABSTRACT

Layered on top of information conveyed by DNA sequence and chromatin are higher order structures that encompass portions of chromosomes, entire chromosomes, and even whole genomes. Interphase chromosomes are not positioned randomly within the nucleus, but instead adopt preferred conformations. Disparate DNA elements co-localize into functionally defined aggregates or 'factories' for transcription and DNA replication. In budding yeast, Drosophila and many other eukaryotes, chromosomes adopt a Rabl configuration, with arms extending from centromeres adjacent to the spindle pole body to telomeres that abut the nuclear envelope. Nonetheless, the topologies and spatial relationships of chromosomes remain poorly understood. Here we developed a method to globally capture intra- and inter-chromosomal interactions, and applied it to generate a map at kilobase resolution of the haploid genome of Saccharomyces cerevisiae. The map recapitulates known features of genome organization, thereby validating the method, and identifies new features. Extensive regional and higher order folding of individual chromosomes is observed. Chromosome XII exhibits a striking conformation that implicates the nucleolus as a formidable barrier to interaction between DNA sequences at either end. Inter-chromosomal contacts are anchored by centromeres and include interactions among transfer RNA genes, among origins of early DNA replication and among sites where chromosomal breakpoints occur. Finally, we constructed a three-dimensional model of the yeast genome. Our findings provide a glimpse of the interface between the form and function of a eukaryotic genome.


Subjects
Chromosome Positioning/physiology , Chromosomes, Fungal/metabolism , Genome, Fungal , Imaging, Three-Dimensional , Intranuclear Space/metabolism , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Cell Nucleolus/genetics , Cell Nucleolus/metabolism , Cell Nucleus/genetics , Cell Nucleus/metabolism , Centromere/genetics , Centromere/metabolism , Chromosome Breakpoints , Chromosomes, Fungal/genetics , DNA Replication , Haploidy , RNA, Transfer/genetics , Replication Origin/genetics
8.
BMC Bioinformatics ; 11: 105, 2010 Feb 24.
Article in English | MEDLINE | ID: mdl-20181279

ABSTRACT

BACKGROUND: Estimation of DNA duplex hybridization free energy is widely used for predicting cross-hybridizations in DNA computing and microarray experiments. A number of software programs based on different methods and parametrizations are available for the theoretical estimation of duplex free energies. However, significant differences in free energy values are sometimes observed among estimations obtained with various methods, making it difficult to decide which value is accurate. RESULTS: We present in this study a quantitative comparison of the similarities and differences among four published DNA/DNA duplex free energy calculation methods and an extended Nearest-Neighbour Model for perfect matches based on triplet interactions. The comparison was performed on a benchmark data set with 695 pairs of short oligos that we collected and manually curated from 29 publications. Sequence lengths range from 4 to 30 nucleotides and span a large GC-content percentage range. For perfect matches, we propose an extension of the Nearest-Neighbour Model that matches or exceeds the performance of the existing ones, both in terms of correlations and root mean squared errors. The proposed model was trained on experimental data with temperature, sodium and sequence concentration characteristics that span a wide range of values, giving the model greater power of generalization when used for free energy estimations of DNA duplexes under non-standard experimental conditions. CONCLUSIONS: Based on our preliminary results, we conclude that no statistically significant differences exist among free energy approximations obtained with 4 publicly available and widely used programs, when benchmarked against a collection of 695 pairs of short oligos collected and curated by the authors of this work based on 29 publications. The extended Nearest-Neighbour Model based on triplet interactions presented in this work is capable of performing accurate estimations of free energies for perfect match duplexes under both standard and non-standard experimental conditions and may serve as a baseline for further developments in this area of research.


Subjects
DNA/chemistry , Thermodynamics , Algorithms , Base Composition , Base Sequence , Models, Molecular , Nucleic Acid Conformation , Nucleic Acid Hybridization , Oligonucleotides/chemistry
9.
RNA ; 16(1): 26-42, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19933322

ABSTRACT

Accurate prediction of RNA pseudoknotted secondary structures from the base sequence is a challenging computational problem. Since prediction algorithms rely on thermodynamic energy models to identify low-energy structures, prediction accuracy relies in large part on the quality of free energy change parameters. In this work, we use our earlier constraint generation and Boltzmann likelihood parameter estimation methods to obtain new energy parameters for two energy models for secondary structures with pseudoknots, namely, the Dirks-Pierce (DP) and the Cao-Chen (CC) models. To train our parameters, and also to test their accuracy, we create a large data set of both pseudoknotted and pseudoknot-free secondary structures. In addition to structural data our training data set also includes thermodynamic data, for which experimentally determined free energy changes are available for sequences and their reference structures. When incorporated into the HotKnots prediction algorithm, our new parameters result in significantly improved secondary structure prediction on our test data set. Specifically, the prediction accuracy when using our new parameters improves from 68% to 79% for the DP model, and from 70% to 77% for the CC model.


Subjects
Computational Biology/methods , Nucleic Acid Conformation , RNA/chemistry , Algorithms , Base Sequence , Energy Metabolism/physiology , Forecasting/methods , Models, Genetic , Molecular Sequence Data , RNA/analysis , Sensitivity and Specificity , Sequence Analysis, RNA , Software , Thermodynamics
10.
BMC Bioinformatics ; 9: 340, 2008 Aug 13.
Article in English | MEDLINE | ID: mdl-18700982

ABSTRACT

BACKGROUND: The ability to access, search and analyse secondary structures of a large set of known RNA molecules is very important for deriving improved RNA energy models, for evaluating computational predictions of RNA secondary structures and for a better understanding of RNA folding. Currently there is no database that can easily provide these capabilities for almost all RNA molecules with known secondary structures. RESULTS: In this paper we describe RNA STRAND - the RNA secondary STRucture and statistical ANalysis Database, a curated database containing known secondary structures of any type and organism. Our new database provides a wide collection of known RNA secondary structures drawn from public databases, searchable and downloadable in a common format. Comprehensive statistical information on the secondary structures in our database is provided using the RNA Secondary Structure Analyser, a new tool we have developed to analyse RNA secondary structures. The information thus obtained is valuable for understanding to what extent and with what probability certain structural motifs can appear. We outline several ways in which the data provided in RNA STRAND can facilitate research on RNA structure, including the improvement of RNA energy models and evaluation of secondary structure prediction programs. In order to keep up-to-date with new RNA secondary structure experiments, we offer the necessary tools to add solved RNA secondary structures to our database and invite researchers to contribute to RNA STRAND. CONCLUSION: RNA STRAND is a carefully assembled database of trusted RNA secondary structures, with easy on-line tools for searching, analyzing and downloading user selected entries, and is publicly available at http://www.rnasoft.ca/strand.


Subjects
Database Management Systems , Databases, Genetic , Models, Chemical , Models, Molecular , RNA/chemistry , RNA/ultrastructure , User-Computer Interface , Computer Graphics , Computer Simulation , Information Storage and Retrieval/methods , Nucleic Acid Conformation
11.
Bioinformatics ; 23(13): i19-28, 2007 Jul 01.
Article in English | MEDLINE | ID: mdl-17646296

ABSTRACT

MOTIVATION: Accurate prediction of RNA secondary structure from the base sequence is an unsolved computational challenge. The accuracy of predictions made by free energy minimization is limited by the quality of the energy parameters in the underlying free energy model. The most widely used model, the Turner99 model, has hundreds of parameters, and so a robust parameter estimation scheme should efficiently handle large data sets with thousands of structures. Moreover, the estimation scheme should also be trained using available experimental free energy data in addition to structural data. RESULTS: In this work, we present constraint generation (CG), the first computational approach to RNA free energy parameter estimation that can be efficiently trained on large sets of structural as well as thermodynamic data. Our CG approach employs a novel iterative scheme, whereby the energy values are first computed as the solution to a constrained optimization problem. Then the newly computed energy parameters are used to update the constraints on the optimization function, so as to better optimize the energy parameters in the next iteration. Using our method on biologically sound data, we obtain revised parameters for the Turner99 energy model. We show that by using our new parameters, we obtain significant improvements in prediction accuracy over current state-of-the-art methods. AVAILABILITY: Our CG implementation is available at http://www.rnasoft.ca/CG/.


Subjects
Algorithms , Models, Chemical , Models, Molecular , RNA/chemistry , RNA/ultrastructure , Sequence Analysis, RNA/methods , Base Sequence , Computer Simulation , Molecular Sequence Data , Nucleic Acid Conformation
12.
Nucleic Acids Res ; 33(15): 4965-77, 2005.
Article in English | MEDLINE | ID: mdl-16284197

ABSTRACT

An algorithm is presented for the generation of sets of non-interacting DNA sequences, employing existing thermodynamic models for the prediction of duplex stabilities and secondary structures. A DNA 'word' structure is employed in which individual DNA 'words' of a given length (e.g. 12mer and 16mer) may be concatenated into longer sequences (e.g. four tandem words and six tandem words). This approach, where multiple word variants are used at each tandem word position, allows very large sets of non-interacting DNA strands to be assembled from combinations of the individual words. Word sets were generated and their figures of merit are compared to sets as described previously in the literature (e.g. 4, 8, 12, 15 and 16mer). The predicted hybridization behavior was experimentally verified on selected members of the sets using standard UV hyperchromism measurements of duplex melting temperatures (T(m)s). Additional experimental validation was obtained by using the sequences in formulating and solving a small example of a DNA computing problem.


Subjects
Algorithms , DNA/chemistry , Sequence Analysis, DNA/methods , Thermodynamics , Base Sequence , Computational Biology/methods , Cytosine/chemistry , Guanine/chemistry , Nucleic Acid Conformation , Nucleic Acid Denaturation , Nucleic Acid Heteroduplexes/chemistry , Nucleic Acid Hybridization , Temperature
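
Illustrative sketch for entry 12: the abstract describes concatenating DNA 'words' in tandem, with several word variants available at each position, so that a large strand set arises from combinations. The Python below shows only that combinatorial assembly step. The word sequences and the function name are hypothetical, and no thermodynamic check of non-interaction is performed.

```python
from itertools import product


def assemble_strands(word_sets):
    """Concatenate one word per tandem position, in every combination.

    word_sets: one list of candidate words for each tandem position.
    The result has len(word_sets[0]) * len(word_sets[1]) * ... strands.
    """
    return ["".join(words) for words in product(*word_sets)]


# Hypothetical 4-mer words: two variants at each of three positions.
example_word_sets = [
    ["ACCT", "TGGA"],
    ["CATC", "GTAG"],
    ["TTCA", "AAGT"],
]
example_strands = assemble_strands(example_word_sets)
```

Here two variants at three positions give 2 * 2 * 2 = 8 strands of 12 nucleotides each, which is why a small set of well-chosen words can yield a very large strand library.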
13.
Nucleic Acids Res ; 33(15): 4951-64, 2005.
Article in English | MEDLINE | ID: mdl-16145053

ABSTRACT

We describe a new algorithm for design of strand sets, for use in DNA computations or universal microarrays. Our algorithm can design sets that satisfy any of several thermodynamic and combinatorial constraints, which aim to maximize desired hybridizations between strands and their complements, while minimizing undesired cross-hybridizations. To heuristically search for good strand sets, our algorithm uses a conflict-driven stochastic local search approach, which is known to be effective in solving comparable search problems. The PairFold program of Andronescu et al. [M. Andronescu, Z. C. Zhang and A. Condon (2005) J. Mol. Biol., 345, 987-1001; M. Andronescu, R. Aguirre-Hernandez, A. Condon, and H. Hoos (2003) Nucleic Acids Res., 31, 3416-3422.] is used to calculate the minimum free energy of hybridization between two mismatched strands. We describe new thermodynamic measures of the quality of strand sets. With respect to these measures of quality, our algorithm consistently finds, within reasonable time, sets that are significantly better than previously published sets in the literature.


Subjects
Algorithms , DNA Probes , Thermodynamics , Computational Biology , DNA Probes/chemistry , Oligonucleotide Array Sequence Analysis , Stochastic Processes
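
Illustrative sketch for entry 13: the abstract describes a conflict-driven stochastic local search for strand sets but gives no pseudocode. The Python below is a minimal version of that general idea under a strong simplification: the only constraint is a minimum pairwise Hamming distance, standing in for the thermodynamic and combinatorial constraints of the paper. The function names, the 10% noise probability, and the step limit are assumptions.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length strands differ."""
    return sum(x != y for x, y in zip(a, b))


def conflicts(strands, min_dist):
    """Index pairs of strands that are closer than min_dist."""
    return [
        (i, j)
        for i in range(len(strands))
        for j in range(i + 1, len(strands))
        if hamming(strands[i], strands[j]) < min_dist
    ]


def design(n, length, min_dist, max_steps=20000, seed=0):
    """Search for n strands with pairwise Hamming distance >= min_dist."""
    rng = random.Random(seed)
    strands = ["".join(rng.choice("ACGT") for _ in range(length)) for _ in range(n)]
    for _ in range(max_steps):
        bad = conflicts(strands, min_dist)
        if not bad:
            return strands
        # Conflict-driven move: mutate one base of a strand in a violated pair.
        i = rng.choice(rng.choice(bad))
        pos = rng.randrange(length)
        old = strands[i]
        new_base = rng.choice("ACGT".replace(old[pos], ""))
        candidate = strands[:i] + [old[:pos] + new_base + old[pos + 1:]] + strands[i + 1:]
        # Accept non-worsening moves; occasionally accept worse ones to escape.
        if len(conflicts(candidate, min_dist)) <= len(bad) or rng.random() < 0.1:
            strands = candidate
    return None  # no conflict-free set found within max_steps
```

A call such as design(6, 8, 4) asks for six 8-mers that differ pairwise in at least four positions. A realistic version would replace hamming with a free-energy estimate of cross-hybridization, as the abstract indicates.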
14.
J Mol Biol ; 345(5): 987-1001, 2005 Feb 04.
Article in English | MEDLINE | ID: mdl-15644199

ABSTRACT

Computational tools for prediction of the secondary structure of two or more interacting nucleic acid molecules are useful for understanding mechanisms for ribozyme function, determining the affinity of an oligonucleotide primer to its target, and designing good antisense oligonucleotides, novel ribozymes, DNA code words, or nanostructures. Here, we introduce new algorithms for prediction of the minimum free energy pseudoknot-free secondary structure of two or more nucleic acid molecules, and for prediction of alternative low-energy (sub-optimal) secondary structures for two nucleic acid molecules. We provide a comprehensive analysis of our predictions against secondary structures of interacting RNA molecules drawn from the literature. Analysis of our tools on 17 sequences of up to 200 nucleotides that do not form pseudoknots shows that they have 79% accuracy, on average, for the minimum free energy predictions. When the best of 100 sub-optimal foldings is taken, the average accuracy increases to 91%. The accuracy decreases as the sequences increase in length and as the number of pseudoknots and tertiary interactions increases. Our algorithms extend the free energy minimization algorithm of Zuker and Stiegler for secondary structure prediction, and the sub-optimal folding algorithm by Wuchty et al. Implementations of our algorithms are freely available in the package MultiRNAFold.


Subjects
Computer Simulation , Nucleic Acid Conformation , RNA/chemistry , RNA/metabolism , Algorithms , Base Sequence , Molecular Sequence Data , Nucleic Acid Hybridization , Oligoribonucleotides/chemistry , Oligoribonucleotides/genetics , Oligoribonucleotides/metabolism , RNA/genetics , RNA, Catalytic/chemistry , RNA, Catalytic/genetics , RNA, Catalytic/metabolism , RNA, Small Nuclear/chemistry , RNA, Small Nuclear/metabolism , Sensitivity and Specificity , Software , Thermodynamics
15.
J Mol Biol ; 336(3): 607-24, 2004 Feb 20.
Article in English | MEDLINE | ID: mdl-15095976

ABSTRACT

The function of many RNAs depends crucially on their structure. Therefore, the design of RNA molecules with specific structural properties has many potential applications, e.g. in the context of investigating the function of biological RNAs, of creating new ribozymes, or of designing artificial RNA nanostructures. Here, we present a new algorithm for solving the following RNA secondary structure design problem: given a secondary structure, find an RNA sequence (if any) that is predicted to fold to that structure. Unlike the (pseudoknot-free) secondary structure prediction problem, this problem appears to be hard computationally. Our new algorithm, "RNA Secondary Structure Designer (RNA-SSD)", is based on stochastic local search, a prominent general approach for solving hard combinatorial problems. A thorough empirical evaluation on computationally predicted structures of biological sequences and artificially generated RNA structures as well as on empirically modelled structures from the biological literature shows that RNA-SSD substantially outperforms the best known algorithm for this problem, RNAinverse from the Vienna RNA Package. In particular, the new algorithm is able to solve structures, consistently, for which RNAinverse is unable to find solutions. The RNA-SSD software is publicly available under the name of RNA Designer at the RNASoft website (www.rnasoft.ca).


Subjects
Algorithms , Nucleic Acid Conformation , RNA/chemistry , Base Sequence , Computer Simulation , Databases, Genetic , Models, Genetic , Molecular Sequence Data
16.
Nucleic Acids Res ; 31(13): 3416-22, 2003 Jul 01.
Article in English | MEDLINE | ID: mdl-12824338

ABSTRACT

DNA and RNA strands are employed in novel ways in the construction of nanostructures, as molecular tags in libraries of polymers and in therapeutics. New software tools for prediction and design of molecular structure will be needed in these applications. The RNAsoft suite of programs provides tools for predicting the secondary structure of a pair of DNA or RNA molecules, testing that combinatorial tag sets of DNA and RNA molecules have no unwanted secondary structure and designing RNA strands that fold to a given input secondary structure. The tools are based on standard thermodynamic models of RNA secondary structure formation. RNAsoft can be found online at http://www.RNAsoft.ca.


Subjects
RNA/chemistry , Software , Base Sequence , DNA/chemistry , Internet , Nucleic Acid Conformation , User-Computer Interface