1.
Proteomics ; 14(23-24): 2719-30, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25263569

ABSTRACT

Cancer is driven by the acquisition of somatic DNA lesions. Distinguishing the early driver mutations from subsequent passenger mutations is key to molecular subtyping of cancers, understanding cancer progression, and the discovery of novel biomarkers. Advances in genomics technologies (whole-genome, exome, and transcript sequencing, collectively referred to as next-generation sequencing, NGS) have fueled recent studies on somatic mutation discovery. However, this vision is challenged by the complexity, redundancy, and errors in genomic data, and by the difficulty of investigating the translated (proteome) portion of aberrant genes using only genomic approaches. Combinations of proteomic and genomic technologies are increasingly being employed. Various strategies have been used to make large-scale NGS data usable in conventional MS/MS searches. This paper discusses different strategies for large database search and false discovery rate (FDR)-based error control, and their implications for cancer proteogenomics. Moreover, it extends and develops the idea of a unified genomic variant database that can be searched by any MS sample. A total of 879 BAM files downloaded from the TCGA repository were used to create a 4.34 GB unified FASTA database that contained 2,787,062 novel splice junctions, 38,464 deletions, 1,105 insertions, and 182,302 substitutions. Proteomic data from a single ovarian carcinoma sample (439,858 spectra) were searched against the database. By applying the most conservative FDR measure, we identified 524 novel peptides and 65,578 known peptides at a 1% FDR threshold. The novel peptides include interesting examples of doubly mutated peptides, frame-shifts, and nonsample-recruited mutations, which emphasize the strength of our approach.
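
The abstract does not say how its FDR measures were computed. As a minimal sketch only, the following assumes the widely used target-decoy convention, in which each peptide-spectrum match (PSM) is flagged as a target or decoy hit and FDR is estimated as decoys divided by targets above a score cutoff. The function name, the `(score, is_decoy)` input layout, and the "higher score is better" convention are illustrative assumptions, not details from the paper.

```python
from typing import List, Optional, Tuple


def score_threshold_at_fdr(
    psms: List[Tuple[float, bool]], max_fdr: float = 0.01
) -> Tuple[Optional[float], int]:
    """Find the lowest target score cutoff whose estimated FDR stays within max_fdr.

    psms: (score, is_decoy) pairs; higher scores are assumed to be better.
    Returns (cutoff_score, accepted_target_count), or (None, 0) if nothing passes.
    Tied scores are not treated specially in this sketch.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best: Tuple[Optional[float], int] = (None, 0)
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
            continue
        targets += 1
        # Estimated FDR among PSMs scoring at or above this target hit.
        if decoys / targets <= max_fdr:
            best = (score, targets)
    return best
```

With a cutoff in hand, PSMs at or above it would be reported; stricter variants (for example, estimating FDR separately for novel and known peptides) change only how the counts are grouped.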


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Neoplasms/metabolism , Proteomics/methods , Protein Databases , Humans , Neoplasms/genetics , Peptides/genetics
2.
J Proteome Res ; 13(1): 21-8, 2014 Jan 03.
Article in English | MEDLINE | ID: mdl-23802565

ABSTRACT

The advent of inexpensive RNA-seq and other deep sequencing technologies for RNA promises to radically improve genomic annotation, providing information on transcribed regions and splicing events in a variety of cellular conditions. Using MS-based proteogenomics, many of these events can be confirmed directly at the protein level. However, integrating large amounts of redundant RNA-seq data with mass spectrometry data poses a challenging problem. Our paper addresses this by constructing a compact database that contains all useful information expressed in RNA-seq reads. Applying our method to cumulative C. elegans data reduced 496.2 GB of aligned RNA-seq SAM files to a 410 MB splice graph database written in FASTA format. This corresponds to a 1000× compression of data size, without loss of sensitivity. We performed a proteogenomics study on the custom data set using a completely automated pipeline and identified a total of 4044 novel events, including 215 novel genes, 808 novel exons, 12 alternative splicing events, 618 gene-boundary corrections, 245 exon-boundary changes, 938 frame shifts, 1166 reverse strands, and 42 translated UTRs. Our results highlight the usefulness of integrating transcript and proteomic data for improved genome annotations.
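
The figures quoted in this abstract can be checked with simple arithmetic. The short sketch below recomputes the compression ratio from the stated file sizes and sums the listed event categories; the variable names are illustrative, and whether the authors used decimal (1 GB = 1000 MB) or binary (1 GB = 1024 MB) units is not stated, so both are shown.

```python
# File sizes as quoted in the abstract.
sam_size_gb = 496.2
fasta_size_mb = 410.0

# Compression ratio under both unit conventions (the abstract does not say which applies).
ratio_decimal = sam_size_gb * 1000 / fasta_size_mb  # roughly 1210
ratio_binary = sam_size_gb * 1024 / fasta_size_mb   # roughly 1239

# Novel event categories as listed in the abstract.
novel_events = {
    "novel genes": 215,
    "novel exons": 808,
    "alternative splicing events": 12,
    "gene-boundary corrections": 618,
    "exon-boundary changes": 245,
    "frame shifts": 938,
    "reverse strands": 1166,
    "translated UTRs": 42,
}
total_events = sum(novel_events.values())  # 4044, matching the stated total

print(f"compression: {ratio_decimal:.0f}x (decimal), {ratio_binary:.0f}x (binary)")
print(f"total novel events: {total_events}")
```

Either convention gives a ratio slightly above 1200, so the quoted 1000× reads as an order-of-magnitude figure, and the eight categories account for the full total of 4044.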


Subject(s)
Caenorhabditis elegans/metabolism , Genetic Databases , Protein Databases , Genome , Proteome , RNA Sequence Analysis , Amino Acid Sequence , Animals , Automation , Caenorhabditis elegans/genetics , Helminth Proteins/chemistry , Helminth Proteins/genetics , Helminth Proteins/metabolism , Molecular Sequence Data
3.
IEEE Trans Pattern Anal Mach Intell ; 38(4): 730-43, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26513777

ABSTRACT

We present a real-time monocular visual odometry system that achieves high accuracy in real-world autonomous driving applications. First, we demonstrate robust monocular SFM that exploits multithreading to handle driving scenes with large motions and rapidly changing imagery. To correct for scale drift, we use the known height of the camera above the ground plane. Our second contribution is a novel data-driven mechanism for cue combination that allows highly accurate ground plane estimation by adapting the observation covariances of multiple cues, such as sparse feature matching and dense inter-frame stereo, based on their relative confidences inferred from visual data on a per-frame basis. Finally, we demonstrate extensive benchmark performance and comparisons on the challenging KITTI dataset, achieving accuracy comparable to stereo and exceeding prior monocular systems. Our SFM system is optimized to output pose within 50 ms in the worst case, while average-case operation is over 30 fps. Our framework also significantly boosts the accuracy of applications like object localization that rely on the ground plane.
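
The two ideas in this abstract, weighting cues by their confidence and using the known camera height to fix scale, can be illustrated in a heavily simplified scalar form. The sketch below is not the authors' method: it assumes each cue yields a single estimate of camera height above the ground with a per-frame variance, fuses them by standard inverse-variance weighting, and derives a scale factor from the known height. All function names and the numbers in the usage note are hypothetical.

```python
from typing import Sequence, Tuple


def fuse_height_estimates(
    heights: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Inverse-variance fusion of per-cue height estimates.

    Cues with smaller variance (higher confidence) receive larger weight.
    Returns (fused_height, fused_variance).
    """
    if len(heights) != len(variances) or not heights:
        raise ValueError("need one positive variance per height estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    fused = sum(w * h for w, h in zip(weights, heights)) / total_weight
    return fused, 1.0 / total_weight


def scale_correction(known_height: float, estimated_height: float) -> float:
    """Factor that rescales the reconstruction so the estimated camera
    height matches the known one."""
    if estimated_height <= 0:
        raise ValueError("estimated height must be positive")
    return known_height / estimated_height
```

For example, fusing hypothetical estimates of 1.0 and 2.0 with variances 1.0 and 3.0 gives 1.25, closer to the more confident cue; if the known height were 1.5, the reconstruction would be scaled by 1.5 / 1.25 = 1.2. Re-estimating the variances every frame is what lets the weighting adapt to changing imagery.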
