1.
Br J Cancer ; 100(9): 1452-64, 2009 May 05.
Article in English | MEDLINE | ID: mdl-19401702

ABSTRACT

Tumour stroma gene expression in biopsy specimens may obscure the expression of tumour parenchyma, hampering the predictive power of microarrays. We aimed to assess the utility of fluorescence-activated cell sorting (FACS) for generating cell populations for gene expression analysis and to compare the gene expression of FACS-purified tumour parenchyma to that of whole tumour biopsies. Single cell suspensions were generated from colorectal tumour biopsies and tumour parenchyma was separated using FACS. Fluorescence-activated cell sorting allowed reliable estimation and purification of cell populations, generating parenchymal purity above 90%. RNA from FACS-purified and corresponding whole tumour biopsies was hybridised to Affymetrix oligonucleotide microarrays. Whole tumour and parenchymal samples demonstrated differential gene expression, with 289 genes significantly overexpressed in the whole tumour, many of which were consistent with stromal gene expression (e.g., COL6A3, COL1A2, POSTN, TIMP2). Genes characteristic of colorectal carcinoma were overexpressed in the FACS-purified cells (e.g., HOX2D and RHOB). We found FACS to be a robust method for generating samples for gene expression analysis, allowing simultaneous assessment of parenchymal and stromal compartments. Gross stromal contamination may affect the interpretation of cancer gene expression microarray experiments, with implications for hypotheses generation and the stability of expression signatures used for predicting clinical outcomes.


Subjects
Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Gene Expression Regulation, Neoplastic , Stromal Cells/pathology , Biopsy , Cell Adhesion Molecules/genetics , Cell Separation/methods , Collagen/genetics , Collagen Type I , Collagen Type VI/genetics , Flow Cytometry , Gene Expression Profiling/methods , Humans , Oligonucleotide Array Sequence Analysis , RNA, Neoplasm/genetics , RNA, Neoplasm/isolation & purification , Tissue Inhibitor of Metalloproteinase-2/genetics
2.
Bioinformatics ; 23(21): 2947-8, 2007 Nov 01.
Article in English | MEDLINE | ID: mdl-17846036

ABSTRACT

SUMMARY: The Clustal W and Clustal X multiple sequence alignment programs have been completely rewritten in C++. This will facilitate the further development of the alignment algorithms in the future and has allowed proper porting of the programs to the latest versions of Linux, Macintosh and Windows operating systems. AVAILABILITY: The programs can be run on-line from the EBI web server: http://www.ebi.ac.uk/tools/clustalw2. The source code and executables for Windows, Linux and Macintosh computers are available from the EBI ftp site ftp://ftp.ebi.ac.uk/pub/software/clustalw2/


Subjects
Algorithms , Computer Graphics , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Software , User-Computer Interface , Amino Acid Sequence , Cluster Analysis , Molecular Sequence Data , Programming Languages
5.
Comp Funct Genomics ; 3(3): 244-53, 2002.
Article in English | MEDLINE | ID: mdl-18628857

ABSTRACT

Accumulating evidence indicates an important role for non-coding RNA molecules in eukaryotic cell regulation. A small number of coding and non-coding overlapping antisense transcripts (OATs) in eukaryotes have been reported, some of which regulate expression of the corresponding sense transcript. The prevalence of this phenomenon is unknown, but there may be an enrichment of such transcripts at imprinted gene loci. Taking a bioinformatics approach, we systematically searched a human mRNA database (RefSeq) for complementary regions that might facilitate pairing with other transcripts. We report 56 pairs of overlapping transcripts, in which each member of the pair is transcribed from the same locus. This allows us to make an estimate of 1000 for the minimum number of such transcript pairs in the entire human genome. This is a surprisingly large number of overlapping gene pairs and, clearly, some of the overlaps may not be functionally significant. Nonetheless, this may indicate an important general role for overlapping antisense control in gene regulation. EST databases were also investigated in order to address the prevalence of cases of imprinted genes with associated non-coding overlapping, antisense transcripts. However, EST databases were found to be completely inappropriate for this purpose.
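
The abstract above does not say how complementary regions were detected. As a rough illustration only, the sketch below assumes a naive exact-match approach: it reports the k-mers of one transcript whose reverse complement occurs in another. The function names and the default `k=8` are hypothetical and are not taken from the paper.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def shared_antisense_kmers(transcript_a, transcript_b, k=8):
    """Return the k-mers of transcript_a whose reverse complement is in transcript_b.

    Illustrative assumption: exact k-mer matching, no mismatches or gaps.
    """
    a = transcript_a.upper()
    b = transcript_b.upper()
    b_kmers = {b[i:i + k] for i in range(len(b) - k + 1)}
    hits = set()
    for i in range(len(a) - k + 1):
        kmer = a[i:i + k]
        if reverse_complement(kmer) in b_kmers:
            hits.add(kmer)
    return hits
```

A real screen would tolerate mismatches and would check that both transcripts map to the same locus, as the abstract requires of each reported pair.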

7.
J Mol Biol ; 302(1): 205-17, 2000 Sep 08.
Article in English | MEDLINE | ID: mdl-10964570

ABSTRACT

We describe a new method (T-Coffee) for multiple sequence alignment that provides a dramatic improvement in accuracy with a modest sacrifice in speed as compared to the most commonly used alternatives. The method is broadly based on the popular progressive approach to multiple alignment but avoids the most serious pitfalls caused by the greedy nature of this algorithm. With T-Coffee we pre-process a data set of all pair-wise alignments between the sequences. This provides us with a library of alignment information that can be used to guide the progressive alignment. Intermediate alignments are then based not only on the sequences to be aligned next but also on how all of the sequences align with each other. This alignment information can be derived from heterogeneous sources such as a mixture of alignment programs and/or structure superposition. Here, we illustrate the power of the approach by using a combination of local and global pair-wise alignments to generate the library. The resulting alignments are significantly more reliable, as determined by comparison with a set of 141 test cases, than any of the popular alternatives that we tried. The improvement, especially clear with the more difficult test cases, is always visible, regardless of the phylogenetic spread of the sequences in the tests.


Subjects
Algorithms , Computational Biology/methods , Sequence Alignment/methods , Amino Acid Motifs , Amino Acid Sequence , Animals , Databases as Topic , Humans , Molecular Sequence Data , Protein Serine-Threonine Kinases/chemistry , Reproducibility of Results , Sensitivity and Specificity , Sequence Homology, Amino Acid , Software
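
The T-Coffee abstract describes pre-processing all pairwise alignments into a library of alignment information. The sketch below shows one way such a library could be represented: each aligned residue pair becomes a key, weighted here by the percent identity of the pairwise alignment it came from. The data layout, the function names, and the weighting choice are assumptions for illustration; the abstract itself does not specify them.

```python
def aligned_pairs(row_a, row_b):
    """Return (i, j) residue-index pairs that share a column in a gapped pairwise alignment."""
    i = j = 0
    pairs = []
    for x, y in zip(row_a, row_b):
        if x != "-" and y != "-":
            pairs.append((i, j))
        if x != "-":
            i += 1
        if y != "-":
            j += 1
    return pairs


def build_library(pairwise):
    """Build a residue-pair library from pairwise alignments.

    pairwise maps (name_a, name_b) to a tuple of two gapped rows.
    The result maps (name_a, i, name_b, j) to a weight (assumed: percent identity).
    """
    library = {}
    for (name_a, name_b), (row_a, row_b) in pairwise.items():
        pairs = aligned_pairs(row_a, row_b)
        seq_a = row_a.replace("-", "")
        seq_b = row_b.replace("-", "")
        identical = sum(seq_a[i] == seq_b[j] for i, j in pairs)
        weight = 100.0 * identical / len(pairs) if pairs else 0.0
        for i, j in pairs:
            key = (name_a, i, name_b, j)
            # Pairs supported by several sources accumulate weight.
            library[key] = library.get(key, 0.0) + weight
    return library
```

Because the library is just a weighted set of residue pairs, pairs from different sources (local and global aligners, or structure superposition) can be merged into the same dictionary.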
9.
Bioinformatics ; 15(4): 341-2, 1999 Apr.
Article in English | MEDLINE | ID: mdl-10320403

ABSTRACT

SUMMARY: MIAH is a WWW server for the automatic alignment of new eukaryotic SSU rRNA sequences to an existing alignment of 1500 sequences. AVAILABILITY: http://chah.ucc.ie/MIAH Contact :


Subjects
RNA, Ribosomal/analysis , Sequence Alignment/methods , Sequence Analysis, RNA/methods , Software , Automation , Eukaryotic Cells
10.
Gene ; 232(1): 11-23, 1999 May 17.
Article in English | MEDLINE | ID: mdl-10333517

ABSTRACT

The family of regulatory and structural muscle proteins, which includes the giant kinases titin, twitchin and projectin, has sequences composed predominantly of serially linked immunoglobulin I set (Ig) and fibronectin type III (FN3) domains. This paper explores the evolutionary relationships between 16 members of this family. In titin, groups of Ig and FN3 domains are arranged in a regularly repeating pattern of seven and 11 domains. The 11-domain super-repeat has its origins in the seven-domain super-repeat and a model for the duplications which gave rise to this super-repeat is proposed. A super-repeat composed solely of immunoglobulin domains is found in the skeletal muscle isoform of titin. Twitchin and projectin, which are presumed to be orthologs, have undergone significant insertion/deletion of domains since their divergence. The common ancestry of myomesin, skelemin and M-protein is shown. The relationship between myosin binding proteins (MyBPs) C and H is confirmed, and MyBP-H is proposed to have given rise to MyBP-C by the acquisition of some titin domains.


Subjects
Evolution, Molecular , Fibronectins/genetics , Immunoglobulins/genetics , Muscle Proteins/genetics , Protein Kinases/genetics , Amino Acid Sequence , Animals , Caenorhabditis elegans Proteins , Calmodulin-Binding Proteins/chemistry , Calmodulin-Binding Proteins/genetics , Connectin , Fibronectins/chemistry , Humans , Immunoglobulins/chemistry , Muscle Proteins/chemistry , Phylogeny , Protein Kinases/chemistry
11.
Mol Cell Probes ; 12(6): 397-405, 1998 Dec.
Article in English | MEDLINE | ID: mdl-9843657

ABSTRACT

Despite its widespread use, the molecular basis of random amplification is poorly understood. Here the basis of random amplification has been investigated by cloning and sequencing the products of a random amplification of polymorphic DNA (RAPD) amplification from Saccharomyces cerevisiae DNA. The genomic origin of the amplified products was determined by sequence comparison with the S. cerevisiae Genome Database (SGD). This allowed analysis of the degree of identity between the random primer and the primer binding sites on the genome. There was no relationship between RAPD size, GC content and relative abundance. The degree of matching between the primer and the primer binding sites increased towards the 3′ end of the primer and decreased towards the 5′ end. The maximum number of mismatches observed between primer and primer binding sites was never more than one between positions 1-7 of the primer. Nucleotide compositional biases were also observed upstream and downstream of the primer binding site with a marked preference for AT richness upstream of the primer binding sites and for a GC preference directly following the 3′ end of the primer. These findings have important ramifications for primer design for multiplex, low stringency and degenerate polymerase chain reaction (PCR).


Subjects
DNA, Fungal/genetics , Genome, Fungal , Random Amplified Polymorphic DNA Technique , Saccharomyces cerevisiae/genetics , Base Sequence , Binding Sites , Cloning, Molecular , DNA Primers , DNA, Fungal/chemistry , Molecular Sequence Data , Nucleic Acid Conformation , Saccharomyces cerevisiae/chemistry , Sequence Alignment , Sequence Analysis, DNA
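
The RAPD abstract reports how well the primer matches its genomic binding sites at each primer position. A minimal sketch of that kind of tabulation follows, assuming each binding site has already been extracted at the same length and orientation as the primer. The function name and the 5′ to 3′ indexing are illustrative choices, not the authors' procedure.

```python
def match_fraction_by_position(primer, sites):
    """Return, for each primer position (5' to 3'), the fraction of sites that match.

    Assumes every site is the same length as the primer and in the same orientation.
    """
    primer = primer.upper()
    if not sites:
        raise ValueError("at least one binding site is required")
    for site in sites:
        if len(site) != len(primer):
            raise ValueError("each site must be as long as the primer")
    fractions = []
    for i, base in enumerate(primer):
        matches = sum(site[i].upper() == base for site in sites)
        fractions.append(matches / len(sites))
    return fractions
```

With real data, a trend like the one in the abstract would appear as fractions rising towards the last (3′) positions of the returned list.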
13.
Bioinformatics ; 14(5): 407-22, 1998 Jun.
Article in English | MEDLINE | ID: mdl-9682054

ABSTRACT

MOTIVATION: In order to increase the accuracy of multiple sequence alignments, we designed a new strategy for optimizing multiple sequence alignments by genetic algorithm. We named it COFFEE (Consistency based Objective Function For alignmEnt Evaluation). The COFFEE score reflects the level of consistency between a multiple sequence alignment and a library containing pairwise alignments of the same sequences. RESULTS: We show that multiple sequence alignments can be optimized for their COFFEE score with the genetic algorithm package SAGA. The COFFEE function is tested on 11 test cases made of structural alignments extracted from 3D_ali. These alignments are compared to those produced using five alternative methods. Results indicate that COFFEE outperforms the other methods when the level of identity between the sequences is low. Accuracy is evaluated by comparison with the structural alignments used as references. We also show that the COFFEE score can be used as a reliability index on multiple sequence alignments. Finally, we show that given a library of structure-based pairwise sequence alignments extracted from FSSP, SAGA can produce high-quality multiple sequence alignments. The main advantage of COFFEE is its flexibility. With COFFEE, any method suitable for making pairwise alignments can be extended to making multiple alignments. AVAILABILITY: The package is available along with the test cases through the WWW: http://www.ebi.ac.uk/cedric CONTACT: cedric.notredame@ebi.ac.uk


Subjects
Algorithms , Sequence Alignment/methods , Software , Amino Acid Sequence , Computational Biology , Databases, Factual , Evaluation Studies as Topic , Fibronectins/genetics , Models, Genetic , Molecular Sequence Data , Sequence Alignment/statistics & numerical data , Sequence Homology, Amino Acid
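
The COFFEE abstract defines its score as the level of consistency between a multiple alignment and a library of pairwise alignments. The sketch below is a simplified, unweighted reading of that idea: the fraction of residue pairs aligned in the multiple alignment that also appear in the library. The data layout and function names are hypothetical, and any per-pair weighting used in the published function is deliberately left out.

```python
from itertools import combinations


def aligned_pairs(row_a, row_b):
    """Return the set of (i, j) residue-index pairs sharing a column in two gapped rows."""
    i = j = 0
    pairs = set()
    for x, y in zip(row_a, row_b):
        if x != "-" and y != "-":
            pairs.add((i, j))
        if x != "-":
            i += 1
        if y != "-":
            j += 1
    return pairs


def consistency_score(msa, library):
    """Fraction of residue pairs aligned in the MSA that are also present in the library.

    msa maps sequence name to gapped row.
    library maps (name_a, name_b), with name_a < name_b, to a set of (i, j) pairs.
    """
    shared = total = 0
    for name_a, name_b in combinations(sorted(msa), 2):
        pairs = aligned_pairs(msa[name_a], msa[name_b])
        total += len(pairs)
        shared += len(pairs & library.get((name_a, name_b), set()))
    return shared / total if total else 0.0
```

A score of 1.0 means every residue pair in the multiple alignment is supported by the library; lower values flag disagreement, which is what makes such a score usable as a reliability index.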
14.
Bioinformatics ; 14(4): 332-41, 1998.
Article in English | MEDLINE | ID: mdl-9632828

ABSTRACT

MOTIVATION: Large alignments of ribosomal RNA sequences are maintained at various sites. New sequences are added to these alignments using a combination of manual and automatic methods. We examine the use of profile alignment methods for rRNA alignment and try to optimize the choice of parameters and sequence weights. RESULTS: Using a large alignment of eukaryotic SSU rRNA sequences as a test case, we empirically compared the performance of various sequence weighting schemes over a range of gap penalties. We developed a new weighting scheme which gives most weight to the sequences in the profile that are most similar to the new sequence. We show that it gives the most accurate alignments when combined with a more traditional sequence weighting scheme. AVAILABILITY: The source code of all software is freely available by anonymous ftp from chah.ucc.ie in the directory /home/ftp/pub/emmet, in the compressed file PRNAA.tar: CONTACT: emmet@chah.ucc.ie, des@chah.ucc.ie


Subjects
RNA, Ribosomal/chemistry , Sequence Alignment , Software , Animals , Humans , Protein Structure, Secondary
15.
Bioinformatics ; 14(10): 830-8, 1998.
Article in English | MEDLINE | ID: mdl-9927711

ABSTRACT

MOTIVATION: The automatic alignment of rRNA sequences can reproduce manual expert alignments with high, but not perfect, fidelity. We examine the use of empirical methods for the identification of regions of an alignment of a new sequence with an existing large alignment which can confidently be predicted to be correctly aligned. RESULTS: We show how to use a simple jack-knife procedure to derive an estimate of the reliability that is to be expected at each position of a large alignment of eukaryotic rRNA sequences. These reliabilities are then improved using measures that are specific to the input sequence. Regions where the sequence-specific reliability method performs particularly well are identified and seen to correspond with elements in the structure of the rRNA molecules that vary between species in the alignment. We also compare these reliability measures to an algorithmic alignment stability measure. AVAILABILITY: The software is available free of charge by sending an e-mail message to emmet@chah.ucc.ie. CONTACT: emmet@chah.ucc.ie


Subjects
RNA, Ribosomal/genetics , Sequence Alignment/statistics & numerical data , Algorithms , Animals , Base Sequence , Computational Biology , Humans , RNA, Fungal/genetics , Reproducibility of Results , Saccharomyces cerevisiae/genetics , Sequence Alignment/methods , Sequence Homology, Nucleic Acid
16.
Nucleic Acids Res ; 25(24): 4876-82, 1997 Dec 15.
Article in English | MEDLINE | ID: mdl-9396791

ABSTRACT

CLUSTAL X is a new windows interface for the widely-used progressive multiple sequence alignment program CLUSTAL W. The new system is easy to use, providing an integrated system for performing multiple sequence and profile alignments and analysing the results. CLUSTAL X displays the sequence alignment in a window on the screen. A versatile sequence colouring scheme allows the user to highlight conserved features in the alignment. Pull-down menus provide all the options required for traditional multiple sequence and profile alignment. New features include: the ability to cut-and-paste sequences to change the order of the alignment, selection of a subset of the sequences to be realigned, and selection of a sub-range of the alignment to be realigned and inserted back into the original alignment. Alignment quality analysis can be performed and low-scoring segments or exceptional residues can be highlighted. Quality analysis and realignment of selected residue ranges provide the user with a powerful tool to improve and refine difficult alignments and to trap errors in input sequences. CLUSTAL X has been compiled on SUN Solaris, IRIX5.3 on Silicon Graphics, Digital UNIX on DECstations, Microsoft Windows (32 bit) for PCs, Linux ELF for x86 PCs, and Macintosh PowerMac.


Subjects
Sequence Alignment , User-Computer Interface , Algorithms , Amino Acid Sequence , Data Display , Molecular Sequence Data , Sequence Homology, Nucleic Acid
17.
Nucleic Acids Res ; 25(22): 4570-80, 1997 Nov 15.
Article in English | MEDLINE | ID: mdl-9358168

ABSTRACT

We describe a new approach for accurately aligning two homologous RNA sequences when the secondary structure of one of them is known. To do so we developed two software packages, called RAGA and PRAGA, which use a genetic algorithm approach to optimize the alignments. RAGA is mainly an extension of SAGA, an earlier package for multiple protein sequence alignment. In PRAGA several genetic algorithms run in parallel and exchange individual solutions. This method allows us to optimize an objective function that describes the quality of a RNA pairwise alignment, taking into account both primary and secondary structure, including pseudoknots. We report results obtained using PRAGA on nine test cases of pairs of eukaryotic small subunit rRNA sequence (nuclear and mitochondrial).


Subjects
Algorithms , RNA, Ribosomal/genetics , RNA/genetics , Sequence Alignment/methods , Software , Animals , Base Sequence , Evaluation Studies as Topic , Humans , Molecular Sequence Data , Nucleic Acid Conformation , RNA, Mitochondrial
18.
Nucleic Acids Res ; 24(8): 1515-24, 1996 Apr 15.
Article in English | MEDLINE | ID: mdl-8628686

ABSTRACT

We describe a new approach to multiple sequence alignment using genetic algorithms and an associated software package called SAGA. The method involves evolving a population of alignments in a quasi evolutionary manner and gradually improving the fitness of the population as measured by an objective function which measures multiple alignment quality. SAGA uses an automatic scheduling scheme to control the usage of 22 different operators for combining alignments or mutating them between generations. When used to optimise the well known sums of pairs objective function, SAGA performs better than some of the widely used alternative packages. This is seen with respect to the ability to achieve an optimal solution and with regard to the accuracy of alignment by comparison with reference alignments based on sequences of known tertiary structure. The general attraction of the approach is the ability to optimise any objective function that one can invent.


Subjects
Algorithms , Sequence Alignment , Software , Amino Acid Sequence , Molecular Sequence Data
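
The SAGA abstract mentions optimising the sums-of-pairs objective function. As a sketch of what that objective measures, the code below scores every pair of rows in every column and adds the results. The match, mismatch, and gap values are placeholder assumptions chosen for readability; a realistic version would use a substitution matrix and a proper gap model.

```python
from itertools import combinations


def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    """Score one pair of symbols from the same alignment column (placeholder values)."""
    if x == "-" and y == "-":
        return 0  # a gap aligned to a gap is conventionally ignored
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch


def sum_of_pairs(rows):
    """Sum pair_score over all pairs of rows in all columns of the alignment."""
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all alignment rows must have the same length")
    return sum(
        pair_score(x, y)
        for column in zip(*rows)
        for x, y in combinations(column, 2)
    )
```

A genetic algorithm in the spirit of the abstract would treat this value as the fitness of a candidate alignment; swapping in a different scoring function changes the objective without changing the search.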
19.
Methods Enzymol ; 266: 383-402, 1996.
Article in English | MEDLINE | ID: mdl-8743695

ABSTRACT

We have tested CLUSTAL W in a wide variety of situations, and it is capable of handling some very difficult protein alignment problems. If the data set consists of enough closely related sequences so that the first alignments are accurate, then CLUSTAL W will usually find an alignment that is very close to ideal. Problems can still occur if the data set includes sequences of greatly different lengths or if some sequences include long regions that are impossible to align with the rest of the data set. Trying to balance the need for long insertions and deletions in some alignments with the need to avoid them in others is still a problem. The default values for our parameters were tested empirically using test cases of sets of globular proteins where some information as to the correct alignment was available. The parameter values may not be very appropriate with nonglobular proteins. We have argued that using one weight matrix and two gap penalties is too simplistic to be of general use in the most difficult cases. We have replaced these parameters with a large number of new parameters designed primarily to help encourage gaps in loop regions. Although these new parameters are largely heuristic in nature, they perform surprisingly well and are simple to implement. The underlying speed of the progressive alignment approach is not adversely affected. The disadvantage is that the parameter space is now huge; the number of possible combinations of parameters is more than can easily be examined by hand. We justify this by asking the user to treat CLUSTAL W as a data exploration tool rather than as a definitive analysis method. It is not sensible to automatically derive multiple alignments and to trust particular algorithms as being capable of always getting the correct answer. One must examine the alignments closely, especially in conjunction with the underlying phylogenetic tree (or estimate of it) and try varying some of the parameters. 
Outliers (sequences that have no close relatives) should be aligned carefully, as should fragments of sequences. The program will automatically delay the alignment of any sequences that are less than 40% identical to any others until all other sequences are aligned, but this can be set from a menu by the user. It may be useful to build up an alignment of closely related sequences first and to then add in the more distant relatives one at a time or in batches, using the profile alignments and weighting scheme described earlier and perhaps using a variety of parameter settings. We give one example using SH2 domains. SH2 domains are widespread in eukaryotic signalling proteins where they function in the recognition of phosphotyrosine-containing peptides. In the chapter by Bork and Gibson ([11], this volume), Blast and pattern/profile searches were used to extract the set of known SH2 domains and to search for new members. (Profiles used in database searches are conceptually very similar to the profiles used in CLUSTAL W: see the chapters [11] and [13] for profile search methods.) The profile searches detected SH2 domains in the JAK family of protein tyrosine kinases, which were thought not to contain SH2 domains. Although the JAK family SH2 domains are rather divergent, they have the necessary core structural residues as well as the critical positively charged residue that binds phosphotyrosine, leaving no doubt that they are bona fide SH2 domains. The five new JAK family SH2 domains were added sequentially to the existing alignment of 65 SH2 domains using the CLUSTAL W profile alignment option. Figure 6 shows part of the resulting alignment. Despite their divergent sequences, the new SH2 domains have been aligned nearly perfectly with the old set. No insertions were placed in the original SH2 domains. In this example, the profile alignment procedure has produced better results than a one-step full alignment of all 70 SH2 domains, and in considerably less time. 
(ABSTRACT TRUNCATED)


Subjects
Amino Acid Sequence , Base Sequence , DNA/chemistry , Databases, Factual , Globins/chemistry , Phylogeny , Proteins/chemistry , Software , Animals , Evolution, Molecular , Fabaceae/genetics , Globins/genetics , Horses , Humans , Leghemoglobin/chemistry , Molecular Sequence Data , Nucleic Acid Conformation , Plants, Medicinal , Protein Structure, Secondary , Protein-Tyrosine Kinases/chemistry , Protein-Tyrosine Kinases/genetics , src Homology Domains
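
The CLUSTAL W abstract states that sequences less than 40% identical to any other are aligned last. The sketch below shows that selection rule on its own, assuming pairwise percent identities have already been computed. The function name and the input layout are hypothetical; this is not CLUSTAL W code.

```python
def divergent_sequences(names, identity, cutoff=40.0):
    """Return the names whose best percent identity to any other sequence is below cutoff.

    identity maps frozenset({name_a, name_b}) to percent identity (0 to 100).
    """
    if len(names) < 2:
        return []
    delayed = []
    for name in names:
        best = max(
            identity[frozenset((name, other))]
            for other in names
            if other != name
        )
        if best < cutoff:
            delayed.append(name)
    return delayed
```

Sequences returned here would be held back until the rest are aligned; the abstract notes that the cutoff can be changed by the user, which corresponds to the `cutoff` argument.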
20.
Protein Sci ; 4(8): 1587-95, 1995 Aug.
Article in English | MEDLINE | ID: mdl-8520485

ABSTRACT

We present a new method for the identification of conserved patterns in a set of unaligned related protein sequences. It is able to discover patterns of a quite general form, allowing for both ambiguous positions and for variable length wildcard regions. It allows the user to define a class of patterns (e.g., the degree of ambiguity allowed and the length and number of gaps), and the method is then guaranteed to find the conserved patterns in this class scoring highest according to a significance measure defined. Identified patterns may be refined using one of two new algorithms. We present a new (nonstatistical) significance measure for flexible patterns. The method is shown to recover known motifs for PROSITE families and is also applied to some recently described families from the literature.


Subjects
Pattern Recognition, Automated , Proteins/chemistry , Sequence Alignment , Algorithms , Amino Acid Sequence , Conserved Sequence , Molecular Sequence Data , Software
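
The abstract above concerns patterns with ambiguous positions and variable-length wildcard regions, of the kind used for PROSITE families. The sketch below does not discover such patterns; it only illustrates their form by converting a PROSITE-style pattern into a Python regular expression so that it can be matched against a sequence. The function name is hypothetical, and only a subset of the syntax is handled (no `<` or `>` anchors).

```python
import re


def prosite_to_regex(pattern):
    """Convert a PROSITE-style pattern, e.g. 'C-x(2,4)-C-[LIVM]-{P}', to a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.groups()
        if core == "x":
            core = "."  # wildcard: any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # any residue except those listed
        # '[ABC]' (ambiguous position) and single letters pass through unchanged
        if low:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)
```

For example, `re.search(prosite_to_regex("C-x(2,4)-C"), sequence)` would find two cysteines separated by two to four arbitrary residues.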