Results 1 - 5 of 5
1.
BMC Bioinformatics ; 12: 18, 2011 Jan 13.
Article in English | MEDLINE | ID: mdl-21226965

ABSTRACT

BACKGROUND: As one of the most widely used parsimony methods for ancestral reconstruction, the Fitch method minimizes the total number of hypothetical substitutions along all branches of a tree to explain the evolution of a character. Because the method is so widely used, its reconstruction accuracy has been studied extensively in recent years. However, most studies are restricted to 2-state evolutionary models, and a study of higher-state models is needed because DNA sequences are 4-state series and protein sequences have 20 states. RESULTS: In this paper, the ambiguous and unambiguous reconstruction accuracies of the Fitch method are studied for N-state evolutionary models. Given an arbitrary phylogenetic tree, a recurrence system is first presented to calculate the two accuracies iteratively. Because the complete binary tree and the comb-shaped tree are the two extremal evolutionary tree topologies with respect to balance, we focus on the reconstruction accuracies on these two topologies and analyze their asymptotic properties. Then, 1000 Yule trees with 1024 leaves are generated and analyzed to simulate real evolutionary scenarios. It is known that more taxa do not necessarily increase the reconstruction accuracies under 2-state models; whether the same holds under N-state models is also tested. CONCLUSIONS: In a large tree with many leaves, the reconstruction accuracies obtained using all taxa are sometimes lower than those obtained using a subset of leaves under N-state models. For complete binary trees, there always exists an equilibrium interval [a, b] of conservation probability in which the limiting ambiguous reconstruction accuracy equals the probability of randomly picking a state. The value b decreases as the number of states increases, and it appears to converge. When the conservation probability is greater than b, the reconstruction accuracies of the Fitch method increase rapidly. The reconstruction accuracies on 1000 simulated Yule trees exhibit similar behavior. For comb-shaped trees, the limiting reconstruction accuracies of using all taxa are always less than or equal to those of using the nearest root-to-leaf path when the conservation probability is not less than 1/N. As a result, more taxa are suggested for ancestral reconstruction when the tree topology is balanced and the sequences are highly similar, and a few taxa close to the root are recommended otherwise.
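
To make the parsimony rule in this abstract concrete, here is a minimal sketch of the bottom-up pass of the Fitch method on a rooted binary tree. It is only an illustration, not the recurrence system of the paper: the `Node` class, the `fitch` function and the example tree are hypothetical names and data chosen for this sketch. At each internal node the candidate state set is the intersection of the children's sets if that intersection is non-empty, and otherwise their union, in which case one substitution is counted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical tree node: leaves carry an observed state, internal nodes two children."""
    state: Optional[str] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def fitch(node: Node) -> tuple[frozenset, int]:
    """Return (candidate state set, minimum substitution count) for the subtree at node."""
    if node.left is None and node.right is None:
        # A leaf contributes its observed state and no substitutions.
        return frozenset({node.state}), 0
    left_set, left_cost = fitch(node.left)
    right_set, right_cost = fitch(node.right)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra substitution.
        return common, left_cost + right_cost
    # The children disagree: keep every candidate and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


# Example tree ((A, C), (A, G)) with four leaves over the DNA alphabet.
example = Node(
    left=Node(left=Node("A"), right=Node("C")),
    right=Node(left=Node("A"), right=Node("G")),
)
root_states, substitutions = fitch(example)
```

For this example the root set contains the single state A with two substitutions. A root set holding more than one state would leave the ancestral state undetermined; treating that as the ambiguous case discussed in the abstract is an assumption of this sketch.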


Subject(s)
Algorithms , Molecular Evolution , Phylogeny , DNA Sequence Analysis , Protein Sequence Analysis
2.
Biochem Biophys Res Commun ; 399(4): 470-4, 2010 Sep 03.
Article in English | MEDLINE | ID: mdl-20678477

ABSTRACT

Protein multiple sequence alignment is an important bioinformatics tool with applications in evolutionary analysis and protein structure prediction. A variety of alignment algorithms have achieved great success, but each has its own inherent deficiencies. In this paper, permutation similarity is proposed to evaluate several widely used protein multiple sequence alignment algorithms. Because the permutation similarity method considers only the relative order of protein evolutionary distances and ignores slight differences between them, it yields more robust evaluations. The longest common subsequence method is adopted to define the similarity between permutations. Using these methods, we assessed Dialign, Tcoffee, ClustalW and Muscle and compared them.
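
The idea of comparing two orderings through their longest common subsequence can be sketched as follows. The abstract does not give the exact definition used in the paper, so the normalization by permutation length and the names `lcs_length` and `permutation_similarity` are assumptions made purely for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def permutation_similarity(p, q):
    """Assumed score: LCS length divided by the permutation length, in [0, 1]."""
    if sorted(p) != sorted(q):
        raise ValueError("p and q must be permutations of the same items")
    if not p:
        return 1.0
    return lcs_length(p, q) / len(p)


# Two hypothetical rankings of five proteins by evolutionary distance.
reference_order = [1, 2, 3, 4, 5]
alignment_order = [2, 1, 3, 5, 4]
score = permutation_similarity(reference_order, alignment_order)
```

Only the relative order of the items enters the score, which matches the abstract's point that small differences between the distances themselves are ignored.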


Subject(s)
Algorithms , Sequence Alignment , Protein Sequence Analysis , Software , Statistical Data Interpretation , Protein Databases
3.
J Biomol Struct Dyn ; 24(3): 239-42, 2006 Dec.
Article in English | MEDLINE | ID: mdl-17054381

ABSTRACT

Classification and prediction of protein domain structural class is one of the important topics in molecular biology. We introduce Bagging (bootstrap aggregating), one of the bootstrap methods, for classifying and predicting protein structural classes. Through a bootstrap aggregating procedure, Bagging can improve a weak classifier, for instance the random tree method, taking a significant step towards optimality. In this research, it is demonstrated that Bagging performed at least as well as LogitBoost and support vector machines in predicting the structural classes for a given protein domain dataset in a 10-fold cross-validation test. This indicates that the Bagging method is promising, and it is anticipated that it could be further improved for predicting protein structural classes as well as other bio-macromolecular attributes if Bagging and other existing methods can effectively complement each other.
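
The bootstrap aggregating procedure can be sketched in a few lines. This is a toy illustration, not the setup of the paper: the weak learner here is a one-feature threshold classifier (a decision stump) standing in for the random tree method, and the function names, the single numeric feature and the tiny dataset are all hypothetical.

```python
import random
from collections import Counter


def train_stump(sample):
    """Fit a one-feature threshold classifier, used here as a deliberately weak learner."""
    best = None
    for f in range(len(sample[0][0])):
        for x, _ in sample:
            t = x[f]
            left = [y for xi, y in sample if xi[f] <= t]
            right = [y for xi, y in sample if xi[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda x: left_label if x[f] <= t else right_label


def bagging(data, n_models=25, seed=0):
    """Train n_models stumps on bootstrap resamples and combine them by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(data) examples with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(train_stump(sample))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Hypothetical toy data: one made-up numeric feature per domain, two class labels.
toy_data = [((v,), "all-alpha") for v in (1.0, 2.0, 3.0, 4.0)]
toy_data += [((v,), "all-beta") for v in (6.0, 7.0, 8.0, 9.0)]
predict = bagging(toy_data)
label = predict((2.5,))
```

Each model sees a different resample of the training data, and the majority vote averages out the instability of the individual weak classifiers, which is the effect the abstract attributes to Bagging.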


Subject(s)
Protein Conformation , Proteins/chemistry , Proteins/classification , Algorithms , Molecular Models
4.
Zhonghua Yi Xue Za Zhi ; 86(26): 1808-12, 2006 Jul 11.
Article in Chinese | MEDLINE | ID: mdl-17054855

ABSTRACT

OBJECTIVE: To investigate the potential genetic regulatory pathway of genes related to breast cancer metastasis. METHODS: Microarray technology was used to identify gene expression profiles and to screen for differentially expressed genes in breast cancer, with special emphasis on metastasis factors. Gene chip data were obtained from surgical samples, including primary and metastatic breast cancer tissues, from 30 female patients with breast cancer at different clinical stages. The potential genetic regulatory pathway of genes related to breast cancer metastasis was then analyzed with a linear differential model and k-means clustering. RESULTS: Twenty-seven differentially expressed genes were identified. The results suggested that the potential regulatory pathway of genes related to breast cancer metastasis comprises the GRP, BPAG1, and SFRP2 genes. CONCLUSION: The metastasis of breast cancer is related to the cancerization caused by the abnormal expression of multiple genes. Multiple bioinformatic methods can reliably be used to analyze the genetic regulatory pathway of genes related to breast cancer metastasis.


Subject(s)
Breast Neoplasms/genetics , Gene Expression Profiling , Lymphatic Metastasis/genetics , Breast Neoplasms/pathology , Cluster Analysis , Humans , Linear Models
5.
Conf Proc IEEE Eng Med Biol Soc ; 2005: 2836-8, 2005.
Article in English | MEDLINE | ID: mdl-17282833

ABSTRACT

To compare large numbers of genomic sequences of related viruses, such as HIV, biologists increasingly need a method that can efficiently handle hundreds, even thousands, of genomic sequences and is accurate enough to align their conserved features correctly. In this paper, we introduce a new and efficient tool named SMA that can easily accommodate large-scale virus genomic sequences. A high-throughput test on 706 HIV-1 genomic sequences shows that SMA is much faster than the available programs with at least the same performance. SMA is a good improvement over existing algorithms for high-volume multiple sequence alignment. It offers an option that provides improved speed and accuracy compared with currently available programs. SMA is freely available at http://mathbio.nankai.edu.cn/e_version/align_query.php.
