Genome Annotation of a Model Diatom Phaeodactylum tricornutum Using an Integrated Proteogenomic Pipeline.
Mol Plant; 11(10): 1292-1307, 2018 Oct 08.
Article in English | MEDLINE | ID: mdl-30176371
Diatoms comprise a diverse and ecologically important group of eukaryotic phytoplankton that significantly contributes to marine primary production and global carbon cycling. Phaeodactylum tricornutum is commonly used as a model organism for studying diatom biology. Although its genome was sequenced in 2008, a high-quality genome annotation is still not available for this diatom. Here we report the development of an integrated proteogenomic pipeline and its application to improving the annotation of the P. tricornutum genome using mass spectrometry (MS)-based proteomics data. Our proteogenomic analysis unambiguously identified approximately 8300 genes and revealed 606 novel proteins, 506 revised genes, 94 splice variants, 58 single amino acid variants, and a holistic view of post-translational modifications in P. tricornutum. We experimentally confirmed a subset of novel events and obtained MS evidence for more than 200 micropeptides in P. tricornutum. These findings expand the genomic landscape of P. tricornutum and provide a rich resource for the study of diatom biology. The proteogenomic pipeline we developed in this study is applicable to any sequenced eukaryote and thus represents a significant contribution to the toolset for eukaryotic proteogenomic analysis. The pipeline and its source code are freely available at https://sourceforge.net/projects/gapeproteogenomic.
Full text:
1
Collections:
01-internacional
Database:
MEDLINE
Main subject:
Genome
/
Diatoms
/
Molecular Sequence Annotation
Study type:
Prognostic_studies
Language:
En
Journal:
Mol Plant
Publication year:
2018
Document type:
Article