A comparative analysis of HGSC and Celera human genome assemblies and gene sets.
Bioinformatics; 19(13): 1597-605, 2003 Sep 01.
Article in En | MEDLINE | ID: mdl-12967954
ABSTRACT
MOTIVATION: Since the simultaneous publication of the human genome assemblies by the International Human Genome Sequencing Consortium (HGSC) and Celera Genomics, several comparisons have been made of various aspects of the two assemblies. In this work, we set out to provide a more comprehensive comparative analysis of the two assemblies and their associated gene sets.
RESULTS: The local sequence content of both draft genome assemblies has been similar since the early releases; however, it took a year for the quality of the Celera assembly to approach that of HGSC, suggesting an advantage of HGSC's hierarchical shotgun (HS) sequencing strategy over Celera's whole genome shotgun (WGS) approach. While similar numbers of ab initio predicted genes can be derived from both assemblies, Celera's Otto approach consistently generated larger, more varied gene sets than the Ensembl gene build system. A non-overlapping gene set has persisted across successive data releases from both groups. Since most of the genes unique to either genome assembly could be mapped back to the other assembly, we conclude that the gene set discrepancies reflect differences not in local sequence content but in the assemblies and, especially, in the gene-prediction methodologies.
Collection: 01-internacional
Database: MEDLINE
Main subject: Genome, Human / Sequence Alignment / Sequence Analysis, DNA / Gene Expression Profiling / Sequence Analysis, Protein / Databases, Protein
Type of study: Diagnostic_studies / Evaluation_studies / Prognostic_studies
Limits: Humans
Language: En
Journal: Bioinformatics
Journal subject: Medical Informatics
Year: 2003
Type: Article
Affiliation country: United States