Results 1 - 20 of 48
1.
J Biosci ; 32(1): 51-70, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17426380

ABSTRACT

The description of protein 3D structures can be performed through a library of 3D fragments, named a structural alphabet. Our structural alphabet is composed of 16 small protein fragments of 5 C alpha in length, called protein blocks (PBs). It allows an efficient approximation of the 3D protein structures and a correct prediction of the local structure. The 72 most frequent series of 5 consecutive PBs, called structural words (SWs), are able to cover more than 90% of the 3D structures. PBs are highly conditioned by the presence of a limited number of transitions between them. In this study, we propose a new method called "pinning strategy" that uses this specific feature to predict long protein fragments. Its goal is to define highly probable successions of PBs. It starts from the most probable SW and is then extended with overlapping SWs. Starting from an initial prediction rate of 34.4%, the use of the SWs instead of the PBs allows a gain of 4.5%. The pinning strategy simply applied to the SWs increases the prediction accuracy to 39.9%. In a second step, the sequence-structure relationship is optimized, and the prediction accuracy reaches 43.6%.
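The "start from the most probable SW, then extend with overlapping SWs" idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `pin_and_extend`, the per-position score tables, the greedy choice at each step, and the assumption that neighbouring words overlap by four PBs are all hypothetical, and the letters are toy labels.

```python
def pin_and_extend(candidates):
    """Greedy sketch of a pinning-style extension (hypothetical, not the published method).

    candidates[i] maps each 5-letter structural word that could start at
    position i to a score (higher means more probable).
    Returns (first_position, predicted_block_string).
    """
    n = len(candidates)
    # Pin the single best-scoring word over all start positions.
    start = max(range(n), key=lambda i: max(candidates[i].values()))
    word = max(candidates[start], key=candidates[start].get)
    seq = {start + k: pb for k, pb in enumerate(word)}

    # Extend to the right: the next word must share its first 4 letters
    # with the last 4 letters already fixed (assumed 4-block overlap).
    for i in range(start + 1, n):
        prefix = "".join(seq[i + k] for k in range(4))
        compatible = {w: s for w, s in candidates[i].items() if w[:4] == prefix}
        if not compatible:
            break
        seq[i + 4] = max(compatible, key=compatible.get)[4]

    # Extend to the left with the mirror-image overlap condition.
    for i in range(start - 1, -1, -1):
        suffix = "".join(seq[i + 1 + k] for k in range(4))
        compatible = {w: s for w, s in candidates[i].items() if w[1:] == suffix}
        if not compatible:
            break
        seq[i] = max(compatible, key=compatible.get)[0]

    lo, hi = min(seq), max(seq)
    return lo, "".join(seq[i] for i in range(lo, hi + 1))


# Toy score tables for three start positions (made-up numbers).
toy = [
    {"abcde": 0.2, "pbcde": 0.1},
    {"bcdef": 0.9, "bcdeg": 0.3},
    {"cdefh": 0.4, "cdefi": 0.5, "mmmmm": 0.8},
]
first, blocks = pin_and_extend(toy)
```

In the toy data the word "mmmmm" scores well on its own but is rejected because it does not overlap the pinned word, which is the point of restricting extensions to compatible words.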


Subjects
Computational Biology/methods , Protein Conformation , Proteins/chemistry , Protein Sequence Analysis , Amino Acid Sequence , Bayes Theorem , Protein Databases , Escherichia coli Proteins/chemistry , Molecular Sequence Data , Peptide Library
2.
Nucleic Acids Res ; 34(Web Server issue): W75-8, 2006 Jul 01.
Article in English | MEDLINE | ID: mdl-16845113

ABSTRACT

Protein Peeling 2 (PP2) is a web server for the automatic identification of protein units (PUs) given the 3D coordinates of a protein. PUs are an intermediate level of protein structure description between protein domains and secondary structures. It is a new tool to better understand and analyze the organization of protein structures. PP2 uses only the matrices of protein contact probabilities and cuts the protein structures optimally using Matthews' correlation coefficient. An index assesses the compactness quality of each PU. Results are given both textually and graphically using Jmol and PyMOL software. The server can be accessed from http://www.ebgm.jussieu.fr/~gelly/index.html.
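For readers unfamiliar with the Matthews correlation coefficient, the standard two-class definition is sketched below in Python. How PP2 derives its counts from the contact probability matrices is not described in the abstract, so the four arguments here are generic confusion-matrix counts and the function name is an assumption.

```python
import math


def matthews_cc(tp, fp, fn, tn):
    """Standard Matthews correlation coefficient for a 2x2 confusion table.

    Returns a value in [-1, 1]; 0.0 is returned when a marginal total is
    zero and the coefficient is undefined (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

A value of 1 corresponds to a perfect two-way split, 0 to a split no better than chance, and -1 to a fully inverted one.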


Subjects
Protein Conformation , Software , Computer Graphics , Internet , Protein Folding , Proteins/chemistry , User-Computer Interface
3.
Bioinformatics ; 19(3): 345-53, 2003 Feb 12.
Article in English | MEDLINE | ID: mdl-12584119

ABSTRACT

MOTIVATION: Our aim is to develop a process that automatically defines a repertory of contiguous 3D protein structure fragments and can be used in homology modeling. We present here improvements to the method we introduced previously: the 'hybrid protein model' (de Brevern and Hazout, Theor. Chem. Acc., 106, 36-47 (2001)). The hybrid protein learns a non-redundant databank encoded in a structural alphabet composed of 16 Protein Blocks (PBs; de Brevern et al., Proteins, 41, 271-287 (2000)). Every local fold is learned by looking for the most similar pattern present in the hybrid protein and modifying it slightly. Finally, each position corresponds to a cluster of similar 3D local folds. RESULTS: In this paper, we describe improvements to our method for building an optimal hybrid protein: (i) 'baby training,' which is defined as the introduction of large structure fragments and the progressive reduction in the size of training fragments; and (ii) the deletion of the redundant parts of the hybrid protein. This repertory of contiguous 3D protein structure fragments should be a useful tool for molecular modeling.


Subjects
Algorithms , Protein Databases , Molecular Models , Proteins/chemistry , Amino Acid Sequence , Artificial Intelligence , Molecular Sequence Data , Peptide Fragments/chemistry , Protein Conformation , Protein Folding , Quality Control
4.
Proteins ; 46(3): 243-9, 2002 Feb 15.
Article in English | MEDLINE | ID: mdl-11835499

ABSTRACT

Knowledge of the disulfide bonding state of the cysteines of proteins is of major interest in designing numerous molecular biology experiments, or in predicting their three-dimensional structure. Previous methods using the information gained from aligned sets of sequences have reached success rates of up to 82% in predicting the oxidation state of cysteines. In the present study, we assess the relative efficiency of different descriptors in predicting the cysteine disulfide bonding states. Our results suggest that the residues flanking the cysteines are less informative about the disulfide bonding state than the amino acid content of the whole protein. Using a combination of logistic functions learned with subsets of proteins homogeneous in terms of their amino acid content, we propose a simple prediction approach, starting from a single sequence, that reaches success rates close to 84%. This score can be improved by avoiding predictions regarding cysteines for which the decision is not well marked. For example, we obtain a score close to 87% correct prediction when we exclude predicting 10% of the cysteines.
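The combination of a logistic function with a "no decision" zone can be sketched as follows. This is a generic illustration in Python, not the published model: the feature vector, weights, bias, and the `reject_margin` parameter are hypothetical, and the abstract does not say how the rejected 10% of cysteines were selected.

```python
import math


def predict_bonding_state(features, weights, bias, reject_margin=0.0):
    """Logistic prediction with an optional rejection zone around p = 0.5.

    features and weights are equal-length sequences of floats (for example,
    amino acid composition descriptors; an assumption made here).
    Returns (label, p), where label is "bonded", "free", or None when the
    probability is too close to 0.5 to be called.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    p = 1.0 / (1.0 + math.exp(-z))
    if abs(p - 0.5) < reject_margin:
        return None, p
    return ("bonded" if p >= 0.5 else "free"), p
```

Widening `reject_margin` trades coverage for accuracy on the remaining predictions, which is the kind of trade-off the abstract reports (about 84% on all cysteines versus about 87% when 10% are left unpredicted).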


Subjects
Cysteine/chemistry , Disulfides/chemistry , Proteins/chemistry , Amino Acids/chemistry , Computer Simulation , Logistic Models , Chemical Models , Protein Tertiary Structure
5.
Bioinformatics ; 17(2): 196-7, 2001 Feb.
Article in English | MEDLINE | ID: mdl-11238079

ABSTRACT

MOSAIC is a set of tools for the segmentation of multiple aligned DNA sequences into homogeneous zones. The segmentation is based on the distribution of mutational events along the alignment. As an example, the analysis of one repeated sequence belonging to the subtelomeric regions of the yeast genome is presented. AVAILABILITY: Free access from ftp://ftp.biomath.jussieu.fr/pub/papers/MOSAIC


Subjects
Sequence Alignment , DNA Sequence Analysis , Software , Fungal Genome , Saccharomyces cerevisiae/genetics , DNA Sequence Analysis/methods
6.
Proteins ; 41(3): 271-87, 2000 Nov 15.
Article in English | MEDLINE | ID: mdl-11025540

ABSTRACT

By using an unsupervised cluster analyzer, we have identified a local structural alphabet composed of 16 folding patterns of five consecutive C(alpha) ("protein blocks"). The dependence that exists between successive blocks is explicitly taken into account. A Bayesian approach based on the relation protein block-amino acid propensity is used for prediction and leads to a success rate close to 35%. Sharing sequence windows associated with certain blocks into "sequence families" improves the prediction accuracy by 6%. This prediction accuracy exceeds 75% when keeping the first four predicted protein blocks at each site of the protein. In addition, two different strategies are proposed: the first one defines the number of protein blocks in each site needed for respecting a user-fixed prediction accuracy, and alternatively, the second one defines the different protein sites to be predicted with a user-fixed number of blocks and a chosen accuracy. This last strategy applied to the ubiquitin conjugating enzyme (alpha/beta protein) shows that 91% of the sites may be predicted with a prediction accuracy larger than 77% considering only three blocks per site. The prediction strategies proposed improve our knowledge about sequence-structure dependence and should be very useful in ab initio protein modelling.
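A Bayesian ranking of blocks from amino acid propensities, together with the "keep the first k predicted blocks" accuracy, can be sketched in Python as below. The sketch assumes that window positions contribute independently given the block (a naive Bayes simplification that the abstract does not confirm), and the names `rank_blocks`, `top_k_accuracy`, `prior`, and `likelihood` are hypothetical.

```python
import math


def rank_blocks(window, prior, likelihood):
    """Rank blocks by log posterior for one sequence window.

    prior[block] is P(block); likelihood[block][pos][aa] is
    P(amino acid aa at window position pos | block).
    Positions are treated as independent given the block (an assumption).
    """
    log_post = {}
    for block, p_block in prior.items():
        lp = math.log(p_block)
        for pos, aa in enumerate(window):
            lp += math.log(likelihood[block][pos][aa])
        log_post[block] = lp
    return sorted(log_post, key=log_post.get, reverse=True)


def top_k_accuracy(ranked_lists, truth, k):
    """Fraction of sites whose true block is among the first k ranked blocks."""
    hits = sum(t in ranked[:k] for ranked, t in zip(ranked_lists, truth))
    return hits / len(truth)


# Toy model: two blocks, windows of two residues, two amino acids.
toy_prior = {"a": 0.5, "b": 0.5}
toy_likelihood = {
    "a": [{"G": 0.8, "A": 0.2}, {"G": 0.5, "A": 0.5}],
    "b": [{"G": 0.1, "A": 0.9}, {"G": 0.5, "A": 0.5}],
}
ranked = [rank_blocks(w, toy_prior, toy_likelihood) for w in ("GA", "AG")]
```

Raising k can only increase the measured accuracy, which is why the abstract reports a much higher rate when four candidate blocks are kept per site than when only the top one is kept.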


Subjects
Bayes Theorem , Computer Simulation , Molecular Models , Peptide Fragments/chemistry , Protein Conformation , Artificial Intelligence , Cluster Analysis , Factual Databases , Forecasting , Ligases , Computer Neural Networks , Peptide Fragments/classification , Protein Secondary Structure , Ubiquitins/metabolism
7.
Protein Eng ; 12(12): 1063-73, 1999 Dec.
Article in English | MEDLINE | ID: mdl-10611400

ABSTRACT

The hidden Markov model (HMM) was used to identify recurrent short 3D structural building blocks (SBBs) describing protein backbones, independently of any a priori knowledge. Polypeptide chains are decomposed into a series of short segments defined by their inter-alpha-carbon distances. Basically, the model takes into account the sequentiality of the observed segments and assumes that each one corresponds to one of several possible SBBs. Fitting the model to a database of non-redundant proteins allowed us to decode proteins in terms of 12 distinct SBBs with different roles in protein structure. Some SBBs correspond to classical regular secondary structures. Others correspond to a significant subdivision of their bounding regions previously considered to be a single pattern. The major contribution of the HMM is that this model implicitly takes into account the sequential connections between SBBs and thus describes the most probable pathways by which the blocks are connected to form the framework of the protein structures. Validation of the SBBs code was performed by extracting SBB series repeated in recoding proteins and examining their structural similarities. Preliminary results on the sequence specificity of SBBs suggest promising perspectives for the prediction of SBBs or series of SBBs from the protein sequences.
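Decoding a chain into its most probable series of hidden states is usually done with the Viterbi algorithm. The Python sketch below shows that standard algorithm on a toy two-state model. It is only an illustration: the study describes segments by inter-alpha-carbon distances, whereas this sketch assumes the observations have already been discretized into classes, and the state names and probabilities are made up.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most probable hidden-state path for a discrete-emission HMM.

    All probabilities are given as natural logarithms to avoid underflow.
    """
    score = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    back = []
    for o in obs[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + log_trans[r][s])
            score[s] = prev[best] + log_trans[best][s] + log_emit[s][o]
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=score.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def logs(table):
    """Convert a dict of probabilities to a dict of log probabilities."""
    return {k: math.log(v) for k, v in table.items()}


# Toy model with two hypothetical building blocks, H and C.
states = ["H", "C"]
log_start = logs({"H": 0.5, "C": 0.5})
log_trans = {"H": logs({"H": 0.8, "C": 0.2}), "C": logs({"H": 0.2, "C": 0.8})}
log_emit = {
    "H": logs({"short": 0.9, "long": 0.1}),
    "C": logs({"short": 0.1, "long": 0.9}),
}
decoded = viterbi(["short", "short", "long", "long"], states,
                  log_start, log_trans, log_emit)
```

Because the transition probabilities enter the score at every step, the decoded path reflects the sequential connections between blocks and not only the best local match at each position.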


Subjects
Proteins/chemistry , Amino Acid Sequence , Amino Acids/chemistry , Carrier Proteins/chemistry , Databases as Topic , Escherichia coli Proteins , Markov Chains , Molecular Models , Molecular Sequence Data , Peptide Fragments/chemistry , Protein Conformation , Protein Secondary Structure
8.
Bioinformatics ; 15(2): 176-7, 1999 Feb.
Article in English | MEDLINE | ID: mdl-10089205

ABSTRACT

PredAcc is a tool for predicting the solvent accessibility of protein residues from the sequence at different relative accessibility levels (0-55%). The prediction rate varies between 70.7% (for 25% relative accessibility) and 85.7% (for 0% relative accessibility). Amino acids are predicted in four categories: almost certainly hidden and almost certainly exposed with a given a posteriori prediction error, probably hidden and probably exposed otherwise. AVAILABILITY: http://condor.urbb.jussieu.fr/PredAccCfg.html CONTACT: tuffery@urbb.jussieu.fr


Subjects
Proteins/chemistry , Software , Amino Acids/chemistry , Computer Simulation , Logistic Models , Chemical Models , Protein Conformation , Solvents
9.
Comput Appl Biosci ; 13(5): 497-508, 1997 Oct.
Article in English | MEDLINE | ID: mdl-9367123

ABSTRACT

MOTIVATION: The approaches usually used for building large genetic maps consist of dividing the marker set into linkage groups and providing local orders that can be tested by multi-point linkage analysis. To deal with the limitations of these approaches, a strategy taking the marker set into account globally is defined. RESULTS: The paper presents a new approach called 'Bi-Dimensional Scaling Map' (BDS-Map) for inferring marker orders and distances in genetic maps based on the use of an additional dimension orthogonal to the map into which markers are projected. Dynamical forces based on a two-point analysis are applied to optimize the marker locations in space. The efficiency of the approach is exemplified on real data (16 and 70 markers on chromosomes 6 and 2, respectively) and simulated data (50 maps of 70 markers).


Subjects
Chromosome Mapping/methods , Software , Algorithms , Computer Graphics , Database Management Systems , Genetic Linkage , Genetic Markers , Programming Languages , User-Computer Interface
10.
Hum Biol ; 69(3): 419-25, 1997 Jun.
Article in English | MEDLINE | ID: mdl-9164051

ABSTRACT

When analyzed by origin, the frequency of the G542X cystic fibrosis (CF) mutation (the second most common CF mutation in Europe after ΔF508) varies between population groups in Europe. We show here that the frequency of G542X varies among different towns or regions of origin, being lower in northeastern Europeans than in southwestern Europeans. The G542X mutation mapping that we have defined by a multiple regression of G542X frequencies covers 28 countries (53 geographic points) and is based on data from 50 laboratories. The more elevated values of G542X frequency correspond to ancient sites of occupation by occidental Phoenicians.


Subjects
Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Cystic Fibrosis/genetics , Emigration and Immigration , Gene Frequency/genetics , Mutation/genetics , Europe , Humans , Regression Analysis
11.
Hum Biol ; 69(2): 253-62, 1997 Apr.
Article in English | MEDLINE | ID: mdl-9057348

ABSTRACT

The apolipoprotein E gene (APOE) is located on chromosome 19. The three most common APOE alleles account for most of the corresponding peptide chain variations in most human populations. APOE*3 is the most common allele, coding for the product E3; APOE*2 codes for an Arg-158→Cys substitution (E2), and APOE*4 codes for a Cys-112→Arg product (E4). We completed a meta-analysis of APOE allele frequencies from 30 geographically defined populations in Europe, including Iceland and Turkey. We performed a weighted multiple regression using normalized geographic coordinates and a fourth-degree polynomial. Next, we constructed maps showing isofrequencies of the *4 allele in Europe. We found a clear north to south decline in *4 allele frequency for continental Western Europe. No such clinal pattern was apparent for the *2 allele frequencies, but for *3 we found an inverse south to north decreasing gradient. Symmetry between the clines of the *4 and *3 alleles is due to a negative correlation coefficient (r = -0.89). We also plotted APOE allele frequencies against latitude; a decreasing cline was evident for *4 frequencies (y = -0.152 + 0.006x, r = 0.904) and an increasing cline was evident for *3 frequencies (y = 1.087 - 0.006x, r = 0.809). Clines for the APOE alleles could be the result of natural selection.
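The two regression lines quoted in the abstract can be evaluated directly. The short Python sketch below does so, assuming that x is latitude in degrees north (the abstract gives no unit) and that the lines are only meaningful within the European range of the data; the function names are hypothetical.

```python
def apoe4_frequency(latitude_deg):
    """Fitted *4 allele frequency from the abstract: y = -0.152 + 0.006x."""
    return -0.152 + 0.006 * latitude_deg


def apoe3_frequency(latitude_deg):
    """Fitted *3 allele frequency from the abstract: y = 1.087 - 0.006x."""
    return 1.087 - 0.006 * latitude_deg


# Evaluate both lines at two illustrative latitudes (40 and 60 degrees).
for lat in (40, 60):
    print(lat, round(apoe4_frequency(lat), 3), round(apoe3_frequency(lat), 3))
```

Under these assumptions the fitted *4 frequency rises from about 0.09 at 40 degrees to about 0.21 at 60 degrees, while *3 falls from about 0.85 to about 0.73. Because the two slopes are equal and opposite, the sum of the two fitted values stays at 0.935 at every latitude, which fits the reported symmetry between the *4 and *3 clines.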


Subjects
Apolipoproteins E/genetics , Gene Frequency , Alleles , Europe/epidemiology , Population Genetics , Humans , Incidence , Regression Analysis
12.
Protein Eng ; 10(4): 361-72, 1997 Apr.
Article in English | MEDLINE | ID: mdl-9194160

ABSTRACT

We have studied the effect of backbone inaccuracy on the efficiency of protein side chain conformation prediction using rotamer libraries. The backbones were generated by randomly perturbing the crystallographic conformation of 12 proteins and exhibit C alpha r.m.s.d.s of up to 2 Å. Our results show that, even for a perturbation of the backbone fully compatible with the temperature factors of the proteins, the predicted side chain conformations of approximately 10% of the buried side chains remain variable. This fraction increases further for larger backbone deviations. However, for backbone deviations of up to 2 Å r.m.s.d., the predicted side chain r.m.s.d. varies only in a ratio of < 1.4. Moreover, a possible strategy for obtaining side chain conformations close to the experimental ones consists of extracting the consensus conformations of the side chains from a series of backbone conformations. Such a procedure allows the computation of the side chain conformations with no loss of accuracy for backbones exhibiting r.m.s.d.s of up to 1 Å from the crystallographic coordinates. For larger backbone deviations (up to 2 Å r.m.s.d.) the r.m.s.d. of the buried side chains increases from 1.33 up to 1.60 Å. We also discuss the influence of the size of the rotamer library on the quality of the prediction.
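The r.m.s.d. values quoted above measure the root-mean-square distance between matched atoms of two conformations. A minimal Python sketch of that calculation follows. It assumes the two coordinate sets list the same atoms in the same order and have already been superimposed; the abstract does not state how superposition was handled, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    No superposition is performed here; the inputs are assumed to be in the
    same reference frame already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

For example, shifting every atom of a structure by 2 Å along one axis gives an r.m.s.d. of exactly 2 Å, the largest backbone deviation considered in the study.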


Subjects
Chemical Models , Protein Conformation , X-Ray Crystallography , Molecular Models
13.
Ann Hum Genet ; 61(Pt 1): 37-47, 1997 Jan.
Article in English | MEDLINE | ID: mdl-9066926

ABSTRACT

The GM immunoglobulin (Ig) allotype distributions of 49 native Amerindian populations from North to South America were analysed by a new technique called 'Mobile Sites Method' (MSM). This allows the global interpretation of genetic diversity in space by means of a distorted geographic map called a 'genetic similarity map'. This approach has been improved by superimposing in the distorted geographic map both the haplotype set (represented by hypothetical populations having a 100% frequency of the haplotype considered) and the 'geography-genetics discontinuities' (i.e. the zones between homogeneous population clusters). This bidimensional representation completes the interpretation of the genetic distances between populations in terms of local genetic diversity and possible migrations. Our results concerning the spatial distribution of the Amerindian populations show: (i) a great interdependence of the geographic locations and the GM haplotype distributions (the importance of the geographic factor was checked with the usual technique of 'random sampling' and the percentage of explained distance variability decreases from 78% with the observed data to a level less than 67% with the random data); (ii) a parallelism between genetic and linguistic groups as indicated by the population clusters in the similarity map; and (iii) a complex distorted map revealing the presence of multiple population migrations and admixtures in the course of time. A particular distortion of South America suggests possible migrations by sea along the western and eastern coasts of Central America, or multiple migration waves without population admixture across Central America.


Subjects
Population Genetics , Immunoglobulin Gm Allotypes/genetics , Central American Indians/genetics , North American Indians/genetics , South American Indians/genetics , Inuit/genetics , Cluster Analysis , Genetic Variation , Haplotypes , Humans , Statistics as Topic
14.
Genet Couns ; 8(2): 77-81, 1997.
Article in English | MEDLINE | ID: mdl-9219003

ABSTRACT

We have collected 76 parent-offspring (CAG)n values in 60 French Huntington's disease (HD) pedigrees. The analysis of intergenerational alterations in CAG repeat length shows that there is a correlation between repeat instability and parental repeat length. Paternally inherited cases are characterized by a preferential trend towards an increase in range of repeat sizes in offspring of HD patients.


Subjects
Huntington Disease/genetics , Meiosis/genetics , Nerve Tissue Proteins/genetics , Nuclear Proteins/genetics , Trinucleotide Repeats/genetics , Adult , DNA/genetics , Female , Genetic Markers/genetics , Genetic Testing , Genetic Variation , Humans , Huntingtin Protein , Huntington Disease/diagnosis , Male , Repetitive Nucleic Acid Sequences
15.
J Mol Evol ; 42(4): 472-5, 1996 Apr.
Article in English | MEDLINE | ID: mdl-8642617

ABSTRACT

One Y-specific DNA polymorphism (p49/TaqI) was studied in a sample of 97 French Basques and compared with those found in 7 other French, Iberian, and Italian populations. A particularly high frequency (72.2%) of Y-haplotype XV was observed in Basques, compared to values (mean of 41%) obtained in other Western Europeans. Basques were also characterized by virtual absence, or presence at a low level, of the South or Near Eastern haplotypes XII, VII, and VIII. Considered together, these results confirm that Basques are a very ancient European population which has had little previous contact with the Neolithics.


Subjects
Haplotypes , Genetic Polymorphism , White People/genetics , Y Chromosome/genetics , Molecular Evolution , France/ethnology , Humans , Italy/ethnology , Portugal/ethnology , Spain/ethnology
16.
Hum Biol ; 67(5): 797-803, 1995 Oct.
Article in English | MEDLINE | ID: mdl-8543293

ABSTRACT

The frequencies of ΔF508, the main cystic fibrosis mutation, vary among different populations in Western Europe; they are higher in northwestern Europeans than in southeastern populations. Our new analysis is based on results from 66 different laboratories on 17,886 cystic fibrosis chromosomes (from 70 locations and 26 countries). The correlation between ΔF508 frequency values and cystic fibrosis incidence is calculated in the corresponding groups.


Subjects
Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Cystic Fibrosis/genetics , Gene Frequency , Mutation , Cystic Fibrosis/epidemiology , Europe/epidemiology , Gene Deletion , Population Genetics , Humans , Incidence , Regression Analysis
17.
Hum Biol ; 67(4): 562-76, 1995 Aug.
Article in English | MEDLINE | ID: mdl-7649531

ABSTRACT

Examination of the European geographic patterns of the 10 relatively most frequent cystic fibrosis mutations, other than ΔF508, shows that a founder effect is apparent for a number of them. The most evident examples are for the W1282X mutation in Jews, with a probable Asian origin, and the G551D and R117H mutations in Celts. Geographic distributions indicate that the main focus of the 621 + 1 G→T and ΔI507 mutations is probably located in Wales. Also, the R1162X mutation probably originates from a circumscribed north Italian region. The N1303K mutation has a wide range in Europe with a clear preponderance in southern countries. Even the relatively common G542X and 1717-1 G→A mutations have a local preponderance in Spain and Sicily and in northern Italy, respectively. Likelihood estimates for recurrent mutation and identity by descent strongly support the hypothesis of recurrence for the (mainly German) mutation R553X.


Subjects
Cystic Fibrosis/genetics , Founder Effect , Europe , Gene Frequency , Humans , Point Mutation/genetics , Genetic Polymorphism
18.
Ann Hum Biol ; 22(3): 183-98, 1995.
Article in English | MEDLINE | ID: mdl-7574444

ABSTRACT

The distribution of surnames in 90 distinct regions in France during two successive periods, 1889-1915 and 1916-1940, is analysed from the civil birth registers of the 36,500 administrative units in France. A new approach, called 'Mobile Site Method' (MSM), is developed to allow representation of a surname distance matrix by a distorted geographical map. A surname distance matrix between the various regions in France is first calculated, then a distorted geographical map called the 'surname similarity map' is built up from the surname distances between regions. To interpret this map we draw (a) successive map contours obtained during the step-by-step distortion process, revealing zones of high surname dissimilarity, and (b) maps in grey levels representing the displacement magnitude, and allowing the segmentation of the geographical and surname maps into 'homogeneous surname zones'. By integrating geography and surname information in the same analysis, and by comparing results obtained for the two successive periods, the MSM approach produces convenient maps showing: (a) 'regionalism' of some peripheral populations such as Pays Basque, Alsace, Corsica and Brittany; (b) the presence of preferential axes of communications (Rhodanian corridor, Garonne valley); (c) barriers such as the Central Massif, Vosges; (d) the weak modifications of the distorted maps associated with the two periods studied, which suggest a limited extension of the tendency towards surname uniformity in France. These results are interpreted, in the nineteenth- and twentieth-century context, as the consequences of a slow process of local migrations occurring over a long period of time.


Subjects
Names , France , Population Genetics , Humans , Maps as Topic
19.
Hum Biol ; 67(2): 231-49, 1995 Apr.
Article in English | MEDLINE | ID: mdl-7537245

ABSTRACT

GM haplotype frequencies were examined in 49 Amerindian tribes (from North, Central, and South America) to investigate the congruence of genetic variation with that observed in language and geography. We used two approaches: (1) the mobile site method, which allows a two-dimensional representation of genetic variation where the distances between reference points (i.e., the locations of the populations in the geographic map after displacements) are close to the genetic distances, and (2) a multivariate analysis (factorial correspondence analysis), which permits a visual interpretation of the geographic distribution of GM haplotypes on a map, completed by a cluster analysis. The results show a strong gradient from the Bering Strait to South America. The Eskimo and Na-Dene are genetically different from all other Amerindians, reflecting their more recent migrations. The orientation of most trajectories of the tribes from Central and South America can be interpreted as earlier migrations along the Pacific and Atlantic coasts. We conclude that geographic and linguistic factors played a part in the genetic diversity of Amerindian tribes.


Subjects
American Indian or Alaska Native , Emigration and Immigration , Ethnicity , Genetic Variation , Immunoglobulin Gm Allotypes/genetics , Linguistics , Americas , Cluster Analysis , Gene Frequency , Haplotypes , Humans , Multivariate Analysis