1.
Proc Natl Acad Sci U S A ; 113(42): 11901-11906, 2016 10 18.
Article in English | MEDLINE | ID: mdl-27702888

ABSTRACT

We report on the sequencing of 10,545 human genomes at 30×-40× coverage with an emphasis on quality metrics and novel variant and sequence discovery. We find that 84% of an individual human genome can be sequenced confidently. This high-confidence region includes 91.5% of exon sequence and 95.2% of known pathogenic variant positions. We present the distribution of over 150 million single-nucleotide variants in the coding and noncoding genome. Each newly sequenced genome contributes an average of 8,579 novel variants. In addition, each genome carries on average 0.7 Mb of sequence that is not found in the main build of the hg38 reference genome. The density of this catalog of variation allowed us to construct high-resolution profiles that define genomic sites that are highly intolerant of genetic variation. These results indicate that the data generated by deep genome sequencing is of the quality necessary for clinical use.


Subject(s)
Genome, Human; Genomics; Whole Genome Sequencing; Chromosome Mapping; Computational Biology/methods; Databases, Nucleic Acid; Genetic Predisposition to Disease; Genetic Variation; Genomics/methods; Humans; Open Reading Frames; Polymorphism, Single Nucleotide; Reproducibility of Results; Untranslated Regions
2.
Int J Bioinform Res Appl ; 5(4): 417-31, 2009.
Article in English | MEDLINE | ID: mdl-19640829

ABSTRACT

High-throughput gene expression data can be used to identify biomarker profiles for classification. The accuracy of microarray-based sample classification depends on both the algorithm used to select the features (genes) and the classification algorithm itself. We have evaluated the performance of over 2000 combinations of feature selection and classification algorithms in classifying cancer datasets. One of these combinations (SVM for ranking genes + SMO) shows excellent classification accuracy using a small number of genes across three cancer datasets tested. Notably, classification using 15 selected genes yields 96% accuracy for a dataset obtained on an independent microarray platform.


Subject(s)
Algorithms; Gene Expression Profiling/methods; Neoplasms/classification; Oligonucleotide Array Sequence Analysis/methods; Neoplasms/genetics
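
The two-stage idea in the abstract of record 2 (rank genes with an SVM, then classify using only the top-ranked genes) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' pipeline: the data are synthetic, the function names are hypothetical, and both stages use a simple linear SVM trained by stochastic subgradient descent rather than the SMO classifier named in the abstract.

```python
import random

def train_linear_svm(X, y, epochs=100, lr=0.01, lam=0.01, seed=0):
    """Fit a linear SVM by stochastic subgradient descent on the
    L2-regularized hinge loss. Labels must be -1 or +1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # Regularization shrinks every weight a little on each step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # The hinge-loss subgradient is nonzero only inside the margin.
            if y[i] * score < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def predict(w, b, x):
    """Return the predicted class label (+1 or -1) for one sample."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def rank_genes(w):
    """Order gene indices by decreasing absolute SVM weight."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)

def make_toy_data(n_samples, n_genes, informative, rng):
    """Synthetic 'expression' matrix (an assumption, not real data):
    informative genes are shifted by the class label, the rest are noise."""
    X, y = [], []
    for k in range(n_samples):
        label = 1 if k % 2 == 0 else -1
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for j in informative:
            row[j] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y

def select_and_classify(X_train, y_train, X_test, y_test, n_selected):
    # Stage 1: rank genes with an SVM trained on all genes.
    w_all, _ = train_linear_svm(X_train, y_train)
    selected = rank_genes(w_all)[:n_selected]

    def keep(rows):
        # Restrict each sample to the selected gene columns.
        return [[r[j] for j in selected] for r in rows]

    # Stage 2: retrain a classifier on the selected genes only.
    w, b = train_linear_svm(keep(X_train), y_train)
    hits = sum(predict(w, b, x) == t for x, t in zip(keep(X_test), y_test))
    return selected, hits / len(y_test)

rng = random.Random(42)
X_train, y_train = make_toy_data(60, 20, informative=[0, 1, 2], rng=rng)
X_test, y_test = make_toy_data(40, 20, informative=[0, 1, 2], rng=rng)
selected, accuracy = select_and_classify(X_train, y_train, X_test, y_test, 3)
print("selected genes:", selected)
print("held-out accuracy:", accuracy)
```

Accuracy is measured on samples that were not used for ranking or training, mirroring the abstract's emphasis on evaluation with an independent dataset. Ranking by absolute weight is only one of many possible feature-selection criteria; the abstract reports comparing over 2000 selection/classifier combinations.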