Results 1 - 20 of 57
1.
Science ; 227(4693): 1435-41, 1985 Mar 22.
Article in English | MEDLINE | ID: mdl-2983426

ABSTRACT

An algorithm was developed which facilitates the search for similarities between newly determined amino acid sequences and sequences already available in databases. Because of the algorithm's efficiency on many microcomputers, sensitive protein database searches may now become a routine procedure for molecular biologists. The method efficiently identifies regions of similar sequence and then scores the aligned identical and differing residues in those regions by means of an amino acid replaceability matrix. This matrix increases sensitivity by giving high scores to those amino acid replacements which occur frequently in evolution. The algorithm has been implemented in a computer program designed to search protein databases very rapidly. For example, comparison of a 200-amino-acid sequence to the 500,000 residues in the National Biomedical Research Foundation library would take less than 2 minutes on a minicomputer, and less than 10 minutes on a microcomputer (IBM PC).
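
The first stage described in this abstract, locating regions of similar sequence before scoring them, can be sketched with a word lookup table. This is only an illustration under stated assumptions, not the published program: the function names, the `ktup` parameter name, and the toy sequences are hypothetical, and the sketch counts identical words per diagonal without the later rescoring by a replaceability matrix.

```python
from collections import defaultdict


def build_lookup(query, ktup=2):
    """Map every ktup-length word in the query to the positions where it starts."""
    table = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        table[query[i:i + ktup]].append(i)
    return table


def diagonal_hits(query, library_seq, ktup=2):
    """Count shared words on each diagonal (library offset minus query offset)."""
    table = build_lookup(query, ktup)
    hits = defaultdict(int)
    for j in range(len(library_seq) - ktup + 1):
        # .get avoids inserting empty entries for words absent from the query
        for i in table.get(library_seq[j:j + ktup], ()):
            hits[j - i] += 1
    return dict(hits)


def best_diagonal(query, library_seq, ktup=2):
    """Return (diagonal, hit_count) for the diagonal with the most word matches."""
    hits = diagonal_hits(query, library_seq, ktup)
    if not hits:
        return None
    return max(hits.items(), key=lambda item: item[1])
```

Because the lookup table is built once per query, each library sequence is scanned in a single pass, which is the kind of shortcut that makes a search feasible on small machines.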


Subjects
Amino Acid Sequence, Computers, Proteins, Software, Angiotensinogen, Animals, Biological Evolution, Bunyaviridae, Cattle, Cyclic AMP/pharmacology, Cytochrome c Group, Humans, Information Systems, Microcomputers, Nucleoproteins, Probability, Protein Kinases, Protein Precursors, Rats, Viral Proteins
2.
Biochim Biophys Acta ; 1190(1): 189-92, 1994 Feb 23.
Article in English | MEDLINE | ID: mdl-8110815

ABSTRACT

A cDNA encoding a beta-subunit of the avian H+/K(+)-ATPase was cloned from a chicken stomach cDNA library, and its nucleotide sequence determined. A comparison between all the available sequence data for the beta-subunits of P-type ATPases reveals several evolutionarily conserved regions. Overall identity was 66% when compared with mammalian H+/K(+)-ATPase beta-subunits, 34% identity when compared with the Na+/K(+)-ATPase beta 2-subunits, and 33% identity when compared with the Na+/K(+)-ATPase beta 1-subunits.


Subjects
H(+)-K(+)-Exchanging ATPase/chemistry, Amino Acid Sequence, Animals, Base Sequence, Chickens, Complementary DNA/analysis, Molecular Sequence Data, Sequence Alignment
3.
Trends Pharmacol Sci ; 12(2): 62-7, 1991 Feb.
Article in English | MEDLINE | ID: mdl-2024290

ABSTRACT

Three 'alpha 1-adrenoceptors' and three 'alpha 2-adrenoceptors' have now been cloned. How closely do these receptors match the native receptors that have been identified pharmacologically? What are the properties of these receptors, and how do they relate to other members of the cationic amine receptor family? Kevin Lynch and his colleagues discuss these questions in this review.


Subjects
Adrenergic Receptors, Amino Acid Sequence, Animals, Molecular Cloning, Humans, Molecular Sequence Data, Adrenergic Receptors/analysis, Adrenergic Receptors/classification, Adrenergic Receptors/metabolism
4.
J Mol Biol ; 276(1): 71-84, 1998 Feb 13.
Article in English | MEDLINE | ID: mdl-9514730

ABSTRACT

The FASTA package of sequence comparison programs has been modified to provide accurate statistical estimates for local sequence similarity scores with gaps. These estimates are derived using the extreme value distribution from the mean and variance of the local similarity scores of unrelated sequences after the scores have been corrected for the expected effect of library sequence length. This approach allows accurate estimates to be calculated for both FASTA and Smith-Waterman similarity scores for protein/protein, DNA/DNA, and protein/translated-DNA comparisons. The accuracy of the statistical estimates is summarized for 54 protein families using FASTA and Smith-Waterman scores. Probability estimates calculated from the distribution of similarity scores are generally conservative, as are probabilities calculated using the Altschul-Gish lambda, kappa, and eta parameters. The performance of several alternative methods for correcting similarity scores for library-sequence length was evaluated using 54 protein superfamilies from the PIR39 database and 110 protein families from the Prosite/SwissProt rel. 34 database. Both regression-scaled and Altschul-Gish scaled scores perform significantly better than unscaled Smith-Waterman or FASTA similarity scores. When the Prosite/SwissProt test set is used, regression-scaled scores perform slightly better; when the PIR database is used, Altschul-Gish scaled scores perform best. Thus, length-corrected similarity scores improve the sensitivity of database searches. Statistical parameters that are derived from the distribution of similarity scores from the thousands of unrelated sequences typically encountered in a database search provide accurate estimates of statistical significance that can be used to infer sequence homology.
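
The estimation idea in this abstract (correct scores for library sequence length, then apply the extreme value distribution) can be sketched roughly as below. This is a simplified illustration, not the FASTA package's code: regressing on the natural log of length is an assumption made here, the function names are hypothetical, and the probability formula is the standard extreme value (Gumbel) distribution normalized to mean 0 and variance 1.

```python
import math


def fit_length_regression(scores, lengths):
    """Least-squares fit of score = a + b * ln(length); returns (a, b, residual variance)."""
    xs = [math.log(n) for n in lengths]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(scores) / len(scores)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [y - (a + b * x) for x, y in zip(xs, scores)]
    variance = sum(r * r for r in residuals) / len(residuals)
    return a, b, variance


def z_score(score, length, a, b, variance):
    """Length-corrected score in standard deviation units (variance must be > 0)."""
    return (score - (a + b * math.log(length))) / math.sqrt(variance)


def evd_probability(z):
    """P(Z >= z) for an extreme value distribution with mean 0 and variance 1."""
    euler_gamma = 0.5772156649015329
    x = math.exp(-math.pi / math.sqrt(6.0) * z - euler_gamma)
    return -math.expm1(-x)  # equals 1 - exp(-x), but accurate for tiny x


def expectation(z, library_size):
    """Expected number of unrelated library sequences scoring at least this well."""
    return library_size * evd_probability(z)
```

In practice the fit would be made over the bulk of library scores, which are assumed to come from unrelated sequences; a handful of true homologs would need to be excluded or down-weighted before fitting.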


Subjects
Sequence Homology, Software, Animals, Factual Databases, Evaluation Studies as Topic, Humans, Mice, Regression Analysis, Amino Acid Sequence Homology, Nucleic Acid Sequence Homology
5.
J Mol Biol ; 291(4): 977-95, 1999 Aug 27.
Article in English | MEDLINE | ID: mdl-10452901

ABSTRACT

The relationship between sequence similarity and structural similarity has been examined in 36 protein families with five or more diverse members whose structures are known. The structural similarity within a family (as determined with the DALI structure comparison program) is linearly related to sequence similarity (as determined by a Smith-Waterman search of the protein sequences in the structure database). The correlation between structural similarity and sequence similarity is very high; 18 of the 36 families had linear correlation coefficients r ≥ 0.878, and only nine had correlation coefficients r

Subjects
Molecular Evolution, Proteins/chemistry, Proteins/genetics, Amino Acid Sequence, Animals, Mutation, Protein Conformation, Protein Folding, Proteins/classification, Regression Analysis, Amino Acid Sequence Homology
6.
Protein Sci ; 4(6): 1145-60, 1995 Jun.
Article in English | MEDLINE | ID: mdl-7549879

ABSTRACT

We have compared commonly used sequence comparison algorithms, scoring matrices, and gap penalties using a method that identifies statistically significant differences in performance. Search sensitivity with either the Smith-Waterman algorithm or FASTA is significantly improved by using modern scoring matrices, such as BLOSUM45-55, and optimized gap penalties instead of the conventional PAM250 matrix. More dramatic improvement can be obtained by scaling similarity scores by the logarithm of the length of the library sequence (ln()-scaling). With the best modern scoring matrix (BLOSUM55 or JO93) and optimal gap penalties (-12 for the first residue in the gap and -2 for additional residues), Smith-Waterman and FASTA performed significantly better than BLASTP. With ln()-scaling and optimal scoring matrices (BLOSUM45 or Gonnet92) and gap penalties (-12, -1), the rigorous Smith-Waterman algorithm performs better than either BLASTP or FASTA, although with the Gonnet92 matrix the difference with FASTA was not significant. Ln()-scaling performed better than normalization based on other simple functions of library sequence length. Ln()-scaling also performed better than scores based on normalized variance, but the differences were not statistically significant for the BLOSUM50 and Gonnet92 matrices. Optimal scoring matrices and gap penalties are reported for Smith-Waterman and FASTA, using conventional or ln()-scaled similarity scores. Searches with no penalty for gap extension, or no penalty for gap opening, or an infinite penalty for gaps performed significantly worse than the best methods. Differences in performance between FASTA and Smith-Waterman were not significant when partial query sequences were used. However, the best performance with complete query sequences was obtained with the Smith-Waterman algorithm and ln()-scaling.
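
The gap-penalty convention quoted in this abstract (one value for the first residue in a gap, another for each additional residue) is easy to misread, so a small sketch may help. The function names are hypothetical. The abstract does not give the exact ln()-scaling formula, so `ln_scaled` below is an assumed form (rescaling by a ratio of logarithms against an arbitrary reference length) shown only to convey the idea that longer library sequences are penalized.

```python
import math


def gap_penalty(length, first=-12, additional=-2):
    """Penalty for a gap of `length` residues: `first` for residue 1, `additional` for each later one."""
    if length <= 0:
        return 0
    return first + additional * (length - 1)


def ln_scaled(score, library_length, reference_length=100):
    """ASSUMED scaling form: multiply the raw score by ln(reference) / ln(library length).

    library_length must be greater than 1 so that its logarithm is positive.
    """
    return score * math.log(reference_length) / math.log(library_length)
```

With the (-12, -2) setting a three-residue gap costs -16; with (-12, -1) it costs -14.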


Subjects
Algorithms, Factual Databases, Proteins/genetics, Sequence Alignment/methods, Amino Acid Sequence Homology, Amino Acid Sequence, Amino Acids/chemistry, Evaluation Studies as Topic, Probability, Regression Analysis
7.
Protein Sci ; 3(3): 525-7, 1994 Mar.
Article in English | MEDLINE | ID: mdl-8019423

ABSTRACT

Although macrophage migration inhibitory factor (MIF) proteins conjugate glutathione, sequence analysis does not support their homology to other glutathione transferases. Glutathione transferases are not detected with MIF proteins in searches of protein sequence databases, and MIF proteins do not share significant sequence similarity with glutathione transferases. Homology cannot be demonstrated by multiple sequence alignment or evolutionary tree construction; such methods assume that the proteins being analyzed are homologous.


Subjects
Glutathione Transferase/genetics, Macrophage Migration-Inhibitory Factors/genetics, Animals, Biological Evolution, Factual Databases, Humans, Intramolecular Oxidoreductases, Mice, Proteins/genetics, Rats, Sequence Alignment, Amino Acid Sequence Homology
8.
Pharmacogenetics ; 3(4): 167-81, 1993 Aug.
Article in English | MEDLINE | ID: mdl-8220436

ABSTRACT

Levels of mRNAs encoding class-alpha glutathione transferases, class-mu glutathione transferases, quinone reductase, and cytochrome P450 1A were measured after xenobiotic induction in murine tissues and in the Hepa1c1c7 murine hepatoma cell line. RNA levels in liver and intestinal mucosa were determined after induction with phenobarbital, butylated hydroxyanisole, beta-naphthoflavone, isosafrole, or combinations of these compounds. The tissue culture cells were presented with combinations of butylated hydroxyanisole, tert-butyl-hydroquinone, and beta-naphthoflavone. In murine liver and intestinal mucosa, the greatest induction (5-15-fold) of glutathione transferases and quinone reductase was seen with butylated hydroxyanisole. Administration of phenobarbital or beta-naphthoflavone has only a modest effect (2-3-fold). In contrast, cytochrome P450 1A mRNA levels increase only slightly after BHA induction but are induced dramatically by beta-naphthoflavone. The pattern of induction is different in Hepa1c1c7 cells; there the greatest induction of all mRNAs occurred with beta-naphthoflavone. Administration of antioxidants with other xenobiotics increases mRNA levels only slightly over the levels obtained with BHA in murine tissues, or with beta-naphthoflavone in Hepa1c1c7 cells. mGSTM1 (GT8.7, Yb1), the most abundant glutathione transferase mRNA in murine liver, is also the most abundant glutathione transferase mRNA in both normal and induced Hepa1c1c7 cells. Our results suggest that BHA induction in murine liver and intestinal mucosa of class-mu and class-alpha glutathione transferases may involve regulatory elements and mediators that function poorly in Hepa1c1c7 cells.


Subjects
Cytochrome P-450 Enzyme System/biosynthesis, Glutathione Transferase/biosynthesis, NAD(P)H Dehydrogenase (Quinone)/biosynthesis, Oxidoreductases/biosynthesis, Messenger RNA/metabolism, Xenobiotics/pharmacology, Animals, Benzoflavones/pharmacology, Butylated Hydroxyanisole/pharmacology, Cytochrome P-450 CYP1A1, Cytochrome P-450 Enzyme System/genetics, Enzyme Induction, Female, Glutathione Transferase/genetics, Intestinal Mucosa/enzymology, Liver/enzymology, Mice, NAD(P)H Dehydrogenase (Quinone)/genetics, Oxidoreductases/genetics, Phenobarbital/pharmacology, Cultured Tumor Cells, beta-Naphthoflavone
9.
Methods Enzymol ; 183: 63-98, 1990.
Article in English | MEDLINE | ID: mdl-2156132

ABSTRACT

The FASTA program can search the NBRF protein sequence library (2.5 million residues) in less than 20 min on an IBM-PC microcomputer and unambiguously detect proteins that shared a common ancestor billions of years in the past. FASTA is both fast and selective because it initially considers only amino acid identities. Its sensitivity is increased not only by using the PAM250 matrix to score and rescore regions with large numbers of identities but also by joining initial regions. The results of searches with FASTA compare favorably with results using NWS-based programs that are 100 times slower. FASTA is slightly less sensitive but considerably more selective. It is not clear that NWS-based programs would be more successful in finding distantly related members of the G-protein-coupled receptor family. The joining step by FASTA to calculate the initn score is especially useful for sequences that share regions of sequence similarity that are separated by variable-length loops. FASTP and FASTA were designed to identify protein sequences that have descended from a common ancestor, and they have proved very useful for this task. In many cases, a FASTA sequence search will result in a list of high scoring library sequences that are homologous to the query sequence, or the search will result in a list of sequences with similarity scores that cannot be distinguished from the bulk of the library. In either case, the question of whether there are sequences in the library that are clearly related to the query sequence has been answered unambiguously. Unfortunately, the results often will not be so clear-cut, and careful analysis of similarity scores, statistical significance, the actual aligned residues, and the biological context are required. 
In the course of analyzing the G-protein-coupled receptor family, several proteins were found that, because of a high initn score and a low init1 score that increased almost 2-fold with optimization, appeared to be members of this family which were not previously recognized. RDF2 analysis showed borderline z values, and only a careful examination of the sequence alignments that focused on the conserved residues provided convincing evidence that the high scores were fortuitous. As sequence comparison methods become more powerful by becoming more sensitive, they become more likely to mislead, and even greater care is required.


Subjects
Amino Acid Sequence, Base Sequence, DNA/genetics, Gene Library, Information Systems, Proteins/genetics, Nucleic Acid Sequence Homology, Algorithms, Animals, Eye Proteins/genetics, Molecular Sequence Data, Beta-Adrenergic Receptors/genetics, Rod Opsins, Software
10.
Methods Enzymol ; 266: 227-58, 1996.
Article in English | MEDLINE | ID: mdl-8743688

ABSTRACT

Although there are several different comparison programs available (e.g., BLASTP, FASTA, SSEARCH, and BLITZ) that can be used with different scoring systems (e.g., PAM120, PAM250, BLOSUM50, BLOSUM62) and different databases (e.g., PIR, SWISS-PROT, GenPept), the following search protocol should identify homologous sequences whenever they can be found. 1. Always compare protein sequences if the genes encode proteins. Protein sequence comparison will typically double the evolutionary lookback time over DNA sequence comparison. 2. Search several sequence databases using a rapid sequence comparison program (e.g., BLASTP or FASTA, ktup = 2). Well-curated databases like PIR or SWISS-PROT tend to have fewer redundant sequences, which improves the statistical significance of a match, but they are less comprehensive and up-to-date than GenPept. 3. If there is good agreement between the distribution of scores and the theoretical distribution, and the alignments do not include "simple sequence" domains, accept sequences with FASTA E() values or BLASTP P() values below 0.02 as homologous. 4. If no library sequences are found with E values below 0.02, perform additional searches with FASTA, ktup = 1, or SSEARCH. If library sequences with E values less than 0.02 are found, the sequences are probably homologous, unless a low-complexity domain is aligned. However, sequences with similarity scores from 0.02 to 10.0 may be homologous as well. To characterize these more distantly related sequences, select "marginal" library sequences and use them to search the databases. Additional family members should have E values less than 0.05. 5. Homologous sequences share a common ancestor, and thus a common protein fold. Depending on the evolutionary distance and divergence path, two or more homologous sequences may have very few absolutely conserved residues. 
However, if homology has been inferred between A and B, between B and C, and between C and D, A and D must be homologous, even if they share no significant similarity. 6. Sequences with marginal E values should also be tested using the PRSS program. Compare the query and library sequences using at least 200 (and preferably 1000) shuffles. Shuffles using a window (-w) of 10-20 are more stringent than a uniform shuffle. Use the E value after 1000 shuffles to confirm an inference of homology. 7. Homologous sequences are usually similar over an entire sequence or domain, typically sharing 20-25% or greater identity for more than 200 residues. Matches that are more than 50% identical in a 20- to 40-amino acid region occur frequently by chance and do not indicate homology. By following these steps, one will very rarely assert that two sequences are homologous when in fact they are not. However, these criteria are stringent; distantly related homologous sequences may fail to be detected because their similarity is not statistically significant. These tests are biased toward missing some distantly related sequences to avoid the possibility of misidentifying unrelated ones. In most database searches, the ratio of unrelated to related sequences is more than 4000:1 (e.g., 10 related and 40,000 unrelated sequences). Thus, one is more likely to mistakenly identify two sequences as related than to overlook a genuine relationship, and our conservative evaluation criteria reflect that bias.
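
Step 6 of this protocol contrasts a uniform shuffle with a windowed shuffle. The sketch below illustrates the difference; it is not the PRSS program. The function names are hypothetical, `score_fn` stands for whatever similarity function the caller supplies, shuffling the library sequence (not the query) is an arbitrary choice made here, and significance is reported as a plain empirical fraction, a simplification.

```python
import random


def uniform_shuffle(seq, rng):
    """Shuffle all residues, preserving only overall composition."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def window_shuffle(seq, window, rng):
    """Shuffle residues only within consecutive windows, preserving local composition."""
    pieces = []
    for start in range(0, len(seq), window):
        chunk = list(seq[start:start + window])
        rng.shuffle(chunk)
        pieces.append("".join(chunk))
    return "".join(pieces)


def shuffle_significance(query, library_seq, score_fn, shuffles=200, window=None, seed=0):
    """Return (real score, fraction of shuffled sequences scoring at least as high)."""
    rng = random.Random(seed)
    real = score_fn(query, library_seq)
    hits = 0
    for _ in range(shuffles):
        if window:
            shuffled = window_shuffle(library_seq, window, rng)
        else:
            shuffled = uniform_shuffle(library_seq, rng)
        if score_fn(query, shuffled) >= real:
            hits += 1
    return real, hits / shuffles
```

A windowed shuffle keeps locally biased composition in place, so shuffled sequences tend to score higher than under a uniform shuffle, which is why the abstract calls it the more stringent test.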


Subjects
Amino Acid Sequence, Factual Databases, Proteins/chemistry, Proteins/genetics, Amino Acid Sequence Homology, Software, Animals, Calmodulin/genetics, Drosophila, Glutathione Transferase/genetics, Humans, Isoenzymes/genetics, Mice, Molecular Sequence Data, Peptide Elongation Factor 1, Peptide Elongation Factors/genetics, Probability, Rats, Regression Analysis, Sensitivity and Specificity
11.
Methods Enzymol ; 210: 575-601, 1992.
Article in English | MEDLINE | ID: mdl-1584052

ABSTRACT

Efficient dynamic programming algorithms are available for a broad class of protein and DNA sequence comparison problems. These algorithms require computer time proportional to the product of the lengths of the two sequences being compared [O(N²)] but require memory space proportional only to the sum of these lengths [O(N)]. Although the requirement for O(N²) time limits use of the algorithms to the largest computers when searching protein and DNA sequence databases, many other applications of these algorithms, such as calculation of distances for evolutionary trees and comparison of a new sequence to a library of sequence profiles, are well within the capabilities of desktop computers. In particular, the results of library searches with rapid searching programs, such as FASTA or BLAST, should be confirmed by performing a rigorous optimal alignment. Whereas rapid methods do not overlook significant sequence similarities, FASTA limits the number of gaps that can be inserted into an alignment, so that a rigorous alignment may extend the alignment substantially in some cases. BLAST does not allow gaps in the local regions that it reports; a calculation that allows gaps is very likely to extend the alignment substantially. Although a Monte Carlo evaluation of the statistical significance of a similarity score with a rigorous algorithm is much slower than the heuristic approach used by the RDF2 program, the dynamic programming approach should take less than 1 hr on a 386-based PC or desktop Unix workstation. For descriptive purposes, we have limited our discussion to methods for calculating similarity scores and distances that use gap penalties of the form g = rk. Nevertheless, programs for the more general case (g = q+rk) are readily available. Versions of these programs that run on Unix workstations, IBM-PC class computers, or the Macintosh can be obtained from either of the authors.
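
The time and memory behaviour described in this abstract can be seen in a score-only local alignment that keeps just two rows of the dynamic programming matrix. This is a minimal sketch under assumptions: it uses the simple gap form g = rk, a match/mismatch score in place of a real scoring matrix, and a hypothetical function name; it returns only the best score, not the alignment itself.

```python
def local_similarity_score(a, b, match=1, mismatch=-1, r=2):
    """Best local alignment score with gap penalty g = r*k, using O(len(b)) memory."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                # start a new local alignment here
                prev[j - 1] + s,  # align a[i-1] with b[j-1]
                prev[j] - r,      # a[i-1] opposite a gap
                curr[j - 1] - r,  # b[j-1] opposite a gap
            )
            best = max(best, curr[j])
        prev = curr
    return best
```

Every cell of the len(a) by len(b) matrix is still visited, so the running time stays proportional to the product of the lengths; only the storage shrinks.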


Subjects
Algorithms, Nucleic Acids/chemistry, Proteins/chemistry, Sequence Alignment/methods, Amino Acid Sequence, Factual Databases, Mathematical Computing, Molecular Sequence Data, Software
12.
J Comput Biol ; 4(3): 339-49, 1997.
Article in English | MEDLINE | ID: mdl-9278064

ABSTRACT

We develop several algorithms for the problem of aligning a DNA sequence with a protein sequence. Our methods account for frameshift errors, but not for introns in the DNA sequence. Thus, they are particularly appropriate for comparing a cDNA sequence that suffers from sequencing errors with an amino acid sequence or a protein sequence database. We describe algorithms for computing optimal alignments for several definitions of DNA-protein alignment, verify sufficient conditions for equivalence of certain definitions, describe techniques for efficient implementation, and discuss experience with these ideas in a new release of the FASTA suite of database-searching programs.
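
To see why frameshift errors matter for DNA-protein comparison, it helps to translate a sequence in all three forward frames. The sketch below is not one of the alignment algorithms from this paper; it is a plain three-frame translation using the standard genetic code, with hypothetical function names, meant only to show how a single-base deletion moves the correct protein into a different frame.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate DNA starting at offset `frame`; unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def forward_frames(dna):
    """Translations of the three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]
```

For example, `ATGGCCAAATAA` reads as `MAK*` in frame 0; deleting one G gives `ATGCCAAATAA`, where frame 0 reads `MPN` and the original `AK*` appears only in frame 2. A frameshift-aware aligner has to be able to switch frames partway through the alignment to recover the full match.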


Subjects
Algorithms, DNA/chemistry, Proteins/chemistry, Sequence Alignment/methods, Factual Databases, DNA Sequence Analysis, Software
13.
Ann N Y Acad Sci ; 389: 106-15, 1982.
Article in English | MEDLINE | ID: mdl-6953913

ABSTRACT

The concentration of serum amyloid A polypeptide (SAAL) increases greatly during the acute phase responses to infection or inflammation. We find that SAAL synthesis comprises 2.5% of murine hepatic protein synthesis after lipopolysaccharide (LPS) administration, but much less in normal liver. SAAL messenger RNA (mRNA) in liver increases at least 500-fold above the normal level. A recombinant plasmid homologous to SAAL mRNA has been isolated, as has most of the mouse genome DNA encoding the plasmid's nucleotide sequence. This gene is transcribed into RNA much more frequently after LPS administration than it is in normal liver. In a number of other mammalian genes, cytosine methylation is inversely related to the rate of transcription. Methylation of CCGG sequences in hepatic DNA homologous to the recombinant plasmid has been examined. Little or no change is found after LPS administration. This suggests that other factors are responsible for the increase in SAAL mRNA in the acute phase response.


Subjects
Amyloid/biosynthesis, Serum Amyloid A Protein/biosynthesis, Amino Acid Sequence, Animals, Base Sequence, DNA/metabolism, In Vitro Techniques, Liver/metabolism, Mice, Inbred BALB C Mice, Nucleic Acid Hybridization, Messenger RNA/metabolism, Genetic Recombination, Serum Amyloid A Protein/genetics, Genetic Transcription
14.
Methods Mol Biol ; 132: 185-219, 2000.
Article in English | MEDLINE | ID: mdl-10547837

ABSTRACT

The FASTA3 and FASTA2 packages provide a flexible set of sequence-comparison programs that are particularly valuable because of their accurate statistical estimates and high-quality alignments. Traditionally, sequence similarity searches have sought to ask one question: "Is my query sequence homologous to anything in the database?" Both FASTA and BLAST can provide reliable answers to this question with their statistical estimates; if the expectation value E is < 0.001-0.01 and you are not doing hundreds of searches a day, the answer is probably yes. In general, the most effective search strategies follow these rules: 1. Whenever possible, compare at the amino acid level, rather than the nucleotide level. Search first with protein sequences (blastp, fasta3, and ssearch3), then with translated DNA sequences (fastx, blastx), and only at the DNA level as a last resort (Table 5). 2. Search the smallest database that is likely to contain the sequence of interest (but it must contain many unrelated sequences for accurate statistical estimates). 3. Use sequence statistics, rather than percent identity or percent similarity, as your primary criterion for sequence homology. 4. Check that the statistics are likely to be accurate by looking for the highest-scoring unrelated sequence, using prss3 to confirm the expectation, and searching with shuffled copies of the query sequence [randseq, searches with shuffled sequences should have E approx 1.0]. 5. Consider searches with different gap penalties and other scoring matrices. Searches with long query sequences against full-length sequence libraries will not change dramatically when BLOSUM62 is used instead of BLOSUM50 (20), or a gap penalty of -14/-2 is used in place of -12/-2. However, shallower or more stringent scoring matrices are more effective at uncovering relationships in partial sequences (3,18), and they can be used to sharpen dramatically the scope of the similarity search. 
However, as illustrated in the last section, the E value is only the first step in characterizing a sequence relationship. Once one has confidence that the sequences are homologous, one should look at the sequence alignments and percent identities, particularly when searching with lower quality sequences. When sequence alignments are very short, the alignment should become more significant when a shallower scoring matrix is used, e.g., BLOSUM62 rather than BLOSUM50 (remember to change the gap penalties). Homology can be reliably inferred from statistically significant similarity. Whereas homology implies common three-dimensional structure, homology need not imply common function. Orthologous sequences usually have similar functions, but paralogous sequences often acquire very different functional roles. Motif databases, such as PROSITE (21), can provide evidence for the conservation of critical functional residues. However, motif identity in the absence of overall sequence similarity is not a reliable indicator of homology.


Subjects
Database Management Systems, Information Storage and Retrieval, Sequence Alignment/methods, Amino Acid Sequence, Molecular Evolution, Molecular Sequence Data, Amino Acid Sequence Homology
19.
Genomics ; 11(3): 635-50, 1991 Nov.
Article in English | MEDLINE | ID: mdl-1774068

ABSTRACT

The sensitivity and selectivity of the FASTA and the Smith-Waterman protein sequence comparison algorithms were evaluated using the superfamily classification provided in the National Biomedical Research Foundation/Protein Identification Resource (PIR) protein sequence database. Sequences from each of the 34 superfamilies in the PIR database with 20 or more members were compared against the protein sequence database. The similarity scores of the related and unrelated sequences were determined using either the FASTA program or the Smith-Waterman local similarity algorithm. These two sets of similarity scores were used to evaluate the ability of the two comparison algorithms to identify distantly related protein sequences. The FASTA program using the ktup = 2 sensitivity setting performed as well as the Smith-Waterman algorithm for 19 of the 34 superfamilies. Increasing the sensitivity by setting ktup = 1 allowed FASTA to perform as well as Smith-Waterman on an additional 7 superfamilies. The rigorous Smith-Waterman method performed better than FASTA with ktup = 1 on 8 superfamilies, including the globins, immunoglobulin variable regions, calmodulins, and plastocyanins. Several strategies for improving the sensitivity of FASTA were examined. The greatest improvement in sensitivity was achieved by optimizing a band around the best initial region found for every library sequence. For every superfamily except the globins and immunoglobulin variable regions, this strategy was as sensitive as a full Smith-Waterman. For some sequences, additional sensitivity was achieved by including conserved but nonidentical residues in the lookup table used to identify the initial region.


Subjects
Algorithms, Amino Acid Sequence, Factual Databases, Information Storage and Retrieval, Proteins/classification, Gene Library, Molecular Sequence Data, Sensitivity and Specificity, Sequence Alignment, Software
20.
Nucleic Acids Res ; 10(1): 217-27, 1982 Jan 11.
Article in English | MEDLINE | ID: mdl-6278405

ABSTRACT

A computer program is described which constructs maps of restriction endonuclease cleavage sites in DNA molecules, given only the fragment lengths. The program utilizes fragment length data from single and double restriction enzyme digests to generate maps for linear or circular molecules. The search for a map can be limited to the unknown (insert) region of a recombinant phage or plasmid. Typical restriction maps with four or five enzymes which cut at three to five unknown sites can be calculated in a few minutes.
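
The mapping problem in this abstract (recovering cut sites from single and double digest fragment lengths) can be illustrated with a brute-force search over fragment orderings. This is a sketch of the general idea, not the program described: it handles only two enzymes on a linear molecule, assumes exact fragment lengths with no measurement error, and uses hypothetical function names. The search is factorial in the number of fragments, so it is practical only for a few sites.

```python
from itertools import permutations


def cut_sites(fragments):
    """Cumulative cut positions implied by an ordered sequence of fragment lengths."""
    sites, position = [], 0
    for length in fragments[:-1]:
        position += length
        sites.append(position)
    return sites


def fragments_from_sites(sites, total):
    """Fragment lengths produced by cutting a linear molecule at the given sites."""
    edges = [0] + sorted(set(sites)) + [total]
    return [b - a for a, b in zip(edges, edges[1:])]


def linear_maps(digest_a, digest_b, double_digest):
    """Enumerate (sites_a, sites_b) maps consistent with both single digests and the double digest."""
    total = sum(digest_a)
    if total != sum(digest_b) or total != sum(double_digest):
        return []
    target = sorted(double_digest)
    maps = set()
    for order_a in set(permutations(digest_a)):
        sites_a = cut_sites(order_a)
        for order_b in set(permutations(digest_b)):
            sites_b = cut_sites(order_b)
            if sorted(fragments_from_sites(sites_a + sites_b, total)) == target:
                maps.add((tuple(sites_a), tuple(sites_b)))
    return sorted(maps)
```

For a 10-unit molecule with single digests of 3 and 7 (enzyme A) and 6 and 4 (enzyme B) and a double digest of 3, 3, and 4, the search returns two maps that are mirror images of each other, since fragment lengths alone cannot distinguish the two ends of a linear molecule.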


Subjects
Base Sequence, Computers, DNA Restriction Enzymes/metabolism, DNA, Autoanalysis, Methods, Substrate Specificity