Results 1 - 9 of 9
1.
Bioinformatics ; 38(17): 4127-4134, 2022 09 02.
Article in English | MEDLINE | ID: mdl-35792837

ABSTRACT

MOTIVATION: Inferring gene regulatory networks in non-independent, genetically related panels is a methodological challenge. This hampers evolutionary and biological studies using heterozygous individuals, such as in wild sunflower populations or cultivated hybrids. RESULTS: First, we simulated 100 datasets of gene expressions and polymorphisms, displaying the same gene expression distributions, heterozygosities and heritabilities as in our dataset including 173 genes and 353 genotypes measured in sunflower hybrids. Secondly, we performed a meta-analysis based on six inference methods [least absolute shrinkage and selection operator (Lasso), Random Forests, Bayesian Networks, Markov Random Fields, Ordinary Least Squares and fast inference of networks from directed regulation (Findr)] and selected the minimal-density networks for better accuracy, with on average 64 edges connecting 79 genes and an area under the precision-recall curve (AUPR) score of 0.35. We identified that triangles and mutual edges are prone to errors in the inferred networks. Applied to classical datasets without heterozygotes, our strategy produced a 0.65 AUPR score for one dataset of the DREAM5 Systems Genetics Challenge. Finally, we applied our method to an experimental dataset from sunflower hybrids. We successfully inferred a network composed of 105 genes connected by 106 putative regulations with a major connected component. AVAILABILITY AND IMPLEMENTATION: Our inference methodology dedicated to genomic and transcriptomic data is available at https://forgemia.inra.fr/sunrise/inference_methods. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
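For readers unfamiliar with the AUPR score quoted in this abstract, the following minimal sketch computes it as average precision over a ranked list of predicted edges, which is one common estimator of the area under the precision-recall curve. The gene names, scores and the `aupr` helper are hypothetical and are not taken from the authors' repository.

```python
def aupr(scored_edges, true_edges):
    """Average precision over a ranked edge list (an AUPR estimator).

    scored_edges: dict mapping (regulator, target) -> confidence score
    true_edges:   set of (regulator, target) pairs in the reference network
    """
    if not true_edges:
        raise ValueError("reference network has no edges")
    # Rank predicted edges from most to least confident.
    ranked = sorted(scored_edges, key=scored_edges.get, reverse=True)
    hits = 0
    total = 0.0
    for rank, edge in enumerate(ranked, start=1):
        if edge in true_edges:
            hits += 1
            total += hits / rank  # precision at each recovered true edge
    return total / len(true_edges)


# Hypothetical toy example: four predicted edges, two of them true.
toy_scores = {
    ("g1", "g2"): 0.9,
    ("g2", "g3"): 0.7,
    ("g1", "g3"): 0.4,
    ("g3", "g1"): 0.1,
}
toy_truth = {("g1", "g2"), ("g1", "g3")}
toy_aupr = aupr(toy_scores, toy_truth)
```

In this toy case the true edges are recovered at ranks 1 and 3, so the score is (1/1 + 2/3) / 2.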


Subjects
Gene Regulatory Networks , Transcriptome , Humans , Heterozygote , Bayes Theorem , Genomics , Algorithms
2.
Proteins ; 89(11): 1522-1529, 2021 11.
Article in English | MEDLINE | ID: mdl-34228826

ABSTRACT

Structure-based computational protein design (CPD) refers to the problem of finding a sequence of amino acids which folds into a specific desired protein structure, and possibly fulfills some targeted biochemical properties. Recent studies point to the particularly rugged CPD energy landscape, suggesting that local search optimization methods should be designed and tuned to easily escape the attraction basins of local minima. In this article, we analyze the performance and search dynamics of an iterated local search (ILS) algorithm enhanced with partition crossover. Our algorithm, PILS, quickly finds local minima and escapes their basins of attraction by solution perturbation. Additionally, the partition crossover operator exploits the structure of the residue interaction graph in order to efficiently mix solutions and find new unexplored basins. Our results on a benchmark of 30 proteins of various topologies and sizes show that PILS consistently finds lower energy solutions compared to Rosetta fixbb and a classic ILS, and that the corresponding sequences are mostly closer to the native sequence.
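The descent-and-perturbation loop described in this abstract can be illustrated with a minimal sketch of a classic ILS on a toy pairwise energy model. This is not PILS: the partition crossover operator is omitted, and the energy tables, function names and parameters (`iterations`, `kick`, `seed`) are hypothetical choices made for illustration only.

```python
import random


def energy(assign, unary, pairwise):
    """Sum of per-position terms and pairwise interaction terms."""
    e = sum(unary[i][a] for i, a in enumerate(assign))
    e += sum(t[assign[i]][assign[j]] for (i, j), t in pairwise.items())
    return e


def local_search(assign, unary, pairwise):
    """Single-site descent until no single change lowers the energy."""
    assign = list(assign)
    best = energy(assign, unary, pairwise)
    improved = True
    while improved:
        improved = False
        for i in range(len(assign)):
            for a in range(len(unary[i])):
                if a == assign[i]:
                    continue
                old = assign[i]
                assign[i] = a
                e = energy(assign, unary, pairwise)
                if e < best:
                    best, improved = e, True  # keep the improving move
                else:
                    assign[i] = old  # undo the move
    return assign, best


def iterated_local_search(unary, pairwise, iterations=200, kick=2, seed=0):
    """Alternate random perturbations ("kicks") with local descent."""
    rng = random.Random(seed)
    n = len(unary)
    start = [rng.randrange(len(unary[i])) for i in range(n)]
    current, current_e = local_search(start, unary, pairwise)
    for _ in range(iterations):
        candidate = list(current)
        # Perturb a few random positions to leave the current basin.
        for i in rng.sample(range(n), min(kick, n)):
            candidate[i] = rng.randrange(len(unary[i]))
        candidate, cand_e = local_search(candidate, unary, pairwise)
        if cand_e <= current_e:  # accept only non-worsening local minima
            current, current_e = candidate, cand_e
    return current, current_e


# Hypothetical toy instance: 4 positions, 3 states each, chain interactions.
toy_unary = [[0, 2, 1], [1, 0, 3], [2, 1, 0], [0, 1, 2]]
toy_table = [[0, 3, 1], [2, 0, 2], [1, 1, 0]]
toy_pairwise = {(0, 1): toy_table, (1, 2): toy_table, (2, 3): toy_table}
best_assign, best_e = iterated_local_search(toy_unary, toy_pairwise)
```

The sketch returns a local minimum only. Unlike exact methods, it gives no guarantee that the result is the global minimum.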


Subjects
Algorithms , Amino Acids/chemistry , Protein Engineering/methods , Proteins/chemistry , Software , Amino Acid Sequence , Benchmarking , Computational Biology , Protein Conformation , Protein Folding , Thermodynamics
3.
Bioinformatics ; 34(15): 2581-2589, 2018 08 01.
Article in English | MEDLINE | ID: mdl-29474517

ABSTRACT

Motivation: Accurate and economic methods to predict change in protein binding free energy upon mutation are imperative to accelerate the design of proteins for a wide range of applications. Free energy is defined by enthalpic and entropic contributions. Following recent progress in Artificial Intelligence-based algorithms for guaranteed NP-hard energy optimization and partition function computation, it becomes possible to quickly compute minimum energy conformations and to reliably estimate the entropic contribution of side-chains in the change of free energy of large protein interfaces. Results: Using guaranteed Cost Function Network algorithms, Rosetta energy functions and Dunbrack's rotamer library, we developed EasyE and JayZ, two methods for binding affinity estimation that respectively ignore and include conformational entropic contributions, and assessed them on a large benchmark of experimental binding affinity measurements. While both approaches outperform most established tools, we observe that side-chain conformational entropy brings little or no improvement on most systems but becomes crucial in some rare cases. Availability and implementation: Open-source Python/C++ code is available at sourcesup.renater.fr/projects/easy-jayz. Supplementary information: Supplementary data are available at Bioinformatics online.
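The difference between ignoring and including conformational entropy can be illustrated with a small numerical sketch. It compares a minimum-energy estimate with a free energy computed from the partition function, F = -kT ln Z. This is not the EasyE or JayZ implementation: the energy values, the `KT` constant and the function names are hypothetical, chosen only to show the arithmetic.

```python
import math

KT = 0.593  # kcal/mol near 298 K; an assumed value for this illustration


def min_energy(energies):
    """Estimate that ignores entropy: keep only the best conformation."""
    return min(energies)


def free_energy(energies, kt=KT):
    """F = -kT ln Z with Z = sum_i exp(-E_i / kT), via a stable log-sum-exp."""
    e_min = min(energies)
    z_shifted = sum(math.exp(-(e - e_min) / kt) for e in energies)
    return e_min - kt * math.log(z_shifted)


# Hypothetical conformational energies (kcal/mol) of a bound and an unbound state.
bound = [-10.0, -9.8, -7.0]
unbound = [-6.0, -5.9, -5.9, -5.8]

dg_min_only = min_energy(bound) - min_energy(unbound)
dg_with_entropy = free_energy(bound) - free_energy(unbound)
```

When many conformations lie close to the minimum, the free energy drops below the minimum energy, so the two estimates can disagree whenever the bound and unbound ensembles differ in how many low-energy conformations they contain.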


Subjects
Artificial Intelligence , Computational Biology/methods , Mutation , Protein Binding , Proteins/chemistry , Thermodynamics , Animals , Bacteria/genetics , Bacteria/metabolism , Entropy , Humans , Protein Conformation , Proteins/genetics , Proteins/metabolism , Software
4.
Bioinformatics ; 29(17): 2129-36, 2013 Sep 01.
Article in English | MEDLINE | ID: mdl-23842814

ABSTRACT

MOTIVATION: The main challenge for structure-based computational protein design (CPD) remains the combinatorial nature of the search space. Even in its simplest fixed-backbone formulation, CPD encompasses a computationally difficult NP-hard problem that prevents the exact exploration of complex systems defining large sequence-conformation spaces. RESULTS: We present here a CPD framework, based on cost function network (CFN) solving, a recent exact combinatorial optimization technique, to efficiently handle highly complex combinatorial spaces encountered in various protein design problems. We show that the CFN-based approach is able to solve to optimality a variety of complex designs that could often not be solved using a usual CPD-dedicated tool or state-of-the-art exact operations research tools. Beyond the identification of the optimal solution, the global minimum-energy conformation, the CFN-based method is also able to quickly enumerate large ensembles of suboptimal solutions of interest to rationally build experimental enzyme mutant libraries. AVAILABILITY: The combined pipeline used to generate energetic models (based on a patched version of the open source solver Osprey 2.0), the conversion to CFN models (based on Perl scripts) and CFN solving (based on the open source solver toulbar2) are all available at http://genoweb.toulouse.inra.fr/~tschiex/CPD


Subjects
Protein Conformation , Protein Engineering/methods , Algorithms , Models, Molecular , Proteins/chemistry , Sequence Analysis, Protein , Software
5.
Bioinformatics ; 26(24): 3035-42, 2010 Dec 15.
Article in English | MEDLINE | ID: mdl-21076149

ABSTRACT

MOTIVATION: Genome maps are imperative to address the genetic basis of the biology of an organism. While a growing number of genomes are being sequenced, providing the ultimate genome maps (and this is now being done at an even faster pace using new generation sequencers), the process of constructing intermediate maps to build and validate a genome assembly remains an important component for producing complete genome sequences. However, current mapping approaches lack the statistical confidence measures necessary to precisely identify relevant inconsistencies between a genome map and an assembly. RESULTS: We propose new methods to derive statistical measures of confidence on genome maps using a comparative model for radiation hybrid data. We describe algorithms that make it possible to (i) sample from a distribution of maps and (ii) exploit this distribution to construct robust maps. We provide an example application of these methods to a dog dataset that demonstrates the value of our approach. AVAILABILITY: Methods are implemented in two freely available software packages: Carthagene (http://www.inra.fr/mia/T/CarthaGene/) and a companion software (metamap, available at: http://snp.toulouse.inra.fr/~servin/index.cgi/Metamap).


Subjects
Chromosome Mapping/methods , Genome , Algorithms , Animals , Chromosomes, Mammalian , Data Interpretation, Statistical , Dogs , Genetic Markers , Radiation Hybrid Mapping/methods , Software
6.
BMC Genomics ; 8: 254, 2007 Jul 26.
Article in English | MEDLINE | ID: mdl-17655763

ABSTRACT

BACKGROUND: Radiation hybrid (RH) maps are considered to be a tool of choice for fine mapping closely linked loci, considering that the resolution of linkage maps is determined by the number of informative meioses and recombination events, which may require very large mapping populations. Accurately defining the marker order on chromosomes is crucial for correct identification of quantitative trait loci (QTL), haplotype map construction and refinement of candidate gene searches. RESULTS: A 12 k radiation hybrid map of bovine chromosome 14 was constructed using 843 single nucleotide polymorphism markers. The resulting map was aligned with the latest version of the bovine assembly (Btau_3.1) as well as other previously published RH maps. The resulting map identified distinct regions on bovine chromosome 14 where discrepancies between this RH map and the bovine assembly occur. A major region of discrepancy was found near the centromere involving the arrangement and order of the scaffolds from the assembly. The map further confirms previously published conserved synteny blocks with human chromosome 8. It also identifies an extra breakpoint and a conserved synteny block previously undetected due to lower marker density. This conserved synteny block is in a region where markers between the RH map presented here and the latest sequence assembly are in very good agreement. CONCLUSION: The increase of publicly available markers shifts the rate-limiting step from marker discovery to the correct identification of their order for further use by the research community. This high-resolution map of bovine chromosome 14 will facilitate identification of regions in the sequence assembly where additional information is required to resolve marker ordering.


Subjects
Chromosomes, Human, Pair 8/genetics , Chromosomes, Mammalian/genetics , Radiation Hybrid Mapping/methods , Animals , Cattle , Genetic Linkage , Genetic Markers/genetics , Humans , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Synteny
7.
J Chem Theory Comput ; 11(12): 5980-9, 2015 Dec 08.
Article in English | MEDLINE | ID: mdl-26610100

ABSTRACT

In Computational Protein Design (CPD), assuming a rigid backbone and amino-acid rotamer library, the problem of finding a sequence with an optimal conformation is NP-hard. In this paper, using Dunbrack's rotamer library and the Talaris2014 decomposable energy function, we use an exact deterministic method combining branch and bound, arc consistency, and tree-decomposition to provably identify the global minimum energy sequence-conformation on full-redesign problems, defining search spaces of size up to 10^234. This is achieved on a single core of a standard computing server, requiring a maximum of 66 GB RAM. A variant of the algorithm is able to exhaustively enumerate all sequence-conformations within an energy threshold of the optimum. These proven optimal solutions are then used to evaluate the frequencies and amplitudes, in energy and sequence, at which an existing CPD-dedicated simulated annealing implementation may miss the optimum on these full redesign problems. The probability of finding an optimum drops close to 0 very quickly. In the worst case, despite 1,000 repeats, the annealing algorithm remained more than 1 Rosetta unit away from the optimum, leading to design sequences that could differ from the optimal sequence by more than 30% of their amino acids.
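The branch-and-bound component named in this abstract can be illustrated on a toy decomposable energy with the minimal sketch below. It uses a deliberately naive lower bound and omits arc consistency and tree-decomposition, so it is far weaker than the method the abstract describes. All tables and function names are hypothetical.

```python
import itertools


def total_energy(assign, unary, pairwise):
    """Decomposable energy: per-position terms plus pairwise terms."""
    e = sum(unary[i][a] for i, a in enumerate(assign))
    return e + sum(t[assign[i]][assign[j]] for (i, j), t in pairwise.items())


def branch_and_bound(unary, pairwise):
    """Depth-first search that prunes branches whose bound cannot beat the best."""
    n = len(unary)
    best = {"energy": float("inf"), "assign": None}
    # Optimistic (lowest possible) contribution of each term.
    min_unary = [min(u) for u in unary]
    min_pair = {k: min(min(row) for row in t) for k, t in pairwise.items()}

    def lower_bound(partial):
        k = len(partial)  # positions 0..k-1 are assigned
        lb = sum(unary[i][partial[i]] for i in range(k))
        lb += sum(min_unary[i] for i in range(k, n))
        for (i, j), table in pairwise.items():
            if i < k and j < k:
                lb += table[partial[i]][partial[j]]  # exact term
            else:
                lb += min_pair[(i, j)]  # optimistic term
        return lb

    def search(partial):
        lb = lower_bound(partial)
        if lb >= best["energy"]:
            return  # prune: this branch cannot improve on the incumbent
        if len(partial) == n:
            # For a full assignment the bound equals the exact energy.
            best["energy"], best["assign"] = lb, list(partial)
            return
        for a in range(len(unary[len(partial)])):
            partial.append(a)
            search(partial)
            partial.pop()

    search([])
    return best["assign"], best["energy"]


# Hypothetical toy instance: 3 positions with 3 states each.
toy_unary = [[3, 0, 2], [1, 4, 0], [2, 2, 1]]
toy_pairwise = {
    (0, 1): [[0, 5, 1], [4, 0, 6], [2, 3, 0]],
    (1, 2): [[1, 0, 3], [0, 2, 2], [5, 1, 0]],
    (0, 2): [[0, 1, 2], [1, 0, 1], [2, 1, 0]],
}
opt_assign, opt_energy = branch_and_bound(toy_unary, toy_pairwise)

# Exhaustive check, feasible only because the toy space has 27 states.
brute_energy = min(
    total_energy(c, toy_unary, toy_pairwise)
    for c in itertools.product(range(3), repeat=3)
)
```

Because the bound never overestimates the energy of any completion, pruning cannot discard the optimum, which is what makes the result provably optimal rather than heuristic.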


Subjects
Algorithms , Proteins/chemistry , Computational Biology , Proteins/metabolism , Thermodynamics
8.
PLoS One ; 6(12): e29165, 2011.
Article in English | MEDLINE | ID: mdl-22216195

ABSTRACT

Modern technologies, and especially next generation sequencing facilities, are giving cheaper access to genotype and genomic data measured on the same sample at once. This creates an ideal situation for multifactorial experiments designed to infer gene regulatory networks. The fifth "Dialogue for Reverse Engineering Assessments and Methods" (DREAM5) challenges are aimed at assessing methods and associated algorithms devoted to the inference of biological networks. Challenge 3 on "Systems Genetics" proposed inferring causal gene regulatory networks from different genetical genomics data sets. We investigated a wide panel of methods ranging from Bayesian networks to penalised linear regressions to analyse such data, and proposed a simple yet very powerful meta-analysis, which combines these inference methods. We present results of the Challenge as well as more in-depth analysis of predicted networks in terms of structure and reliability. The developed meta-analysis was ranked first among the 16 teams participating in Challenge 3A. It paves the way for future extensions of our inference method and more accurate gene network estimates in the context of genetical genomics.
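The abstract does not state how the individual methods are combined. As a hedged illustration of one simple way to build such a meta-analysis, the sketch below averages the rank that each method assigns to each candidate edge. The rank-averaging rule, the method names, the scores and the function names are all assumptions made for illustration, not the authors' procedure.

```python
def rank_edges(scores):
    """Map each edge to its rank under one method (1 = most confident)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {edge: r for r, edge in enumerate(ordered, start=1)}


def consensus_ranking(method_scores):
    """Order edges by their mean rank across methods (lower is better)."""
    rankings = {name: rank_edges(s) for name, s in method_scores.items()}
    edges = set().union(*(s.keys() for s in method_scores.values()))
    mean_rank = {}
    for edge in edges:
        ranks = [
            # Edges a method did not score are placed after its last rank.
            rankings[name].get(edge, len(method_scores[name]) + 1)
            for name in method_scores
        ]
        mean_rank[edge] = sum(ranks) / len(ranks)
    # Break ties on the edge itself so the output is deterministic.
    return sorted(edges, key=lambda e: (mean_rank[e], e))


# Hypothetical scores from two inference methods on three candidate edges.
toy_methods = {
    "lasso": {("g1", "g2"): 0.9, ("g2", "g3"): 0.5, ("g1", "g3"): 0.1},
    "random_forest": {("g1", "g2"): 0.6, ("g2", "g3"): 0.2, ("g1", "g3"): 0.8},
}
toy_consensus = consensus_ranking(toy_methods)
```

Working on ranks rather than raw scores avoids having to put scores from different methods on a common scale.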


Subjects
Bayes Theorem , Gene Regulatory Networks , Genomics , Mutation
9.
Bioinformatics ; 21(8): 1703-4, 2005 Apr 15.
Article in English | MEDLINE | ID: mdl-15598829

ABSTRACT

CAR(H)(T)A GENE is an integrated genetic and radiation hybrid (RH) mapping tool which can deal with multiple populations, including mixtures of genetic and RH data. CAR(H)(T)A GENE performs multipoint maximum likelihood estimations with accelerated expectation-maximization algorithms for some pedigrees and has sophisticated algorithms for marker ordering. Dedicated heuristics for framework mapping are also included. CAR(H)(T)A GENE can be used as a C++ library, through a shell command and through a graphical interface. XML output for companion tools is integrated. AVAILABILITY: The program is available free of charge, as open source, from www.inra.fr/bia/T/CarthaGene for Linux, Windows and Solaris machines. CONTACT: tschiex@toulouse.inra.fr.


Subjects
Crosses, Genetic , Genetics, Population , Models, Genetic , Radiation Hybrid Mapping/methods , Software , User-Computer Interface , Chromosome Mapping/methods , Computer Graphics , Computer Simulation , Models, Statistical