ABSTRACT
BACKGROUND: Bronchoscopy is a common procedure used for evaluation of suspicious lung nodules, but the low diagnostic sensitivity of bronchoscopy often results in inconclusive results and delays in treatment. Percepta Genomic Sequencing Classifier (GSC) was developed to assist with patient management in cases where bronchoscopy is inconclusive. Studies have shown that exposure to tobacco smoke alters gene expression in airway epithelial cells in a way that indicates an increased risk of developing lung cancer. Percepta GSC leverages this idea of a molecular "field of injury" from smoking and was developed using RNA sequencing data generated from lung bronchial brushings of the upper airway. A Percepta GSC score is calculated from an ensemble of machine learning algorithms utilizing clinical and genomic features and is used to refine a patient's risk stratification. METHODS: The objective of the analysis described and reported here is to validate the analytical performance of Percepta GSC. Analytical performance studies characterized the sensitivity of Percepta GSC test results to input RNA quantity, the potentially interfering agents of blood and genomic DNA, and the reproducibility of test results within and between processing runs and between laboratories. RESULTS: Varying the amount of input RNA into the assay across a nominal range had no significant impact on Percepta GSC classifier results. Bronchial brushing RNA contaminated with up to 10% genomic DNA by nucleic acid mass also showed no significant difference on classifier results. The addition of blood RNA, a potential contaminant in the bronchial brushing sample, caused no change to classifier results at up to 11% contamination by RNA proportion. Percepta GSC scores were reproducible between runs, within runs, and between laboratories, varying within less than 4% of the total score range (standard deviation of 0.169 for scores on 4.57 scale). 
CONCLUSIONS: The analytical sensitivity, analytical specificity, and reproducibility of Percepta GSC laboratory results were successfully demonstrated under conditions of expected day to day variation in testing. Percepta GSC test results are analytically robust and suitable for routine clinical use.
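The reproducibility figure quoted in the RESULTS can be checked with one line of arithmetic. The sketch below uses only the two numbers given in the abstract (standard deviation 0.169, score scale 4.57); the function name is an illustrative assumption and is not part of Percepta GSC.

```python
def sd_as_percent_of_range(sd: float, score_range: float) -> float:
    """Express a standard deviation as a percentage of the total score range."""
    if score_range <= 0:
        raise ValueError("score_range must be positive")
    return 100.0 * sd / score_range


# Values quoted in the abstract: SD 0.169 on a 4.57-unit score scale.
percepta_gsc_pct = sd_as_percent_of_range(0.169, 4.57)
print(f"{percepta_gsc_pct:.1f}% of the score range")
```

The result is about 3.7%, consistent with the "less than 4%" statement.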
Subjects
Genomics, High-Throughput Nucleotide Sequencing, Lung Neoplasms/diagnosis, Lung Neoplasms/genetics, Multiple Pulmonary Nodules/diagnosis, Multiple Pulmonary Nodules/genetics, Biopsy, Clinical Decision-Making, Computational Biology/methods, Differential Diagnosis, Disease Management, Gene Expression Profiling, Genomics/methods, Humans, Liquid Biopsy, Reproducibility of Results, Risk Assessment
ABSTRACT
BACKGROUND: Clinical guidelines specify that diagnosis of idiopathic pulmonary fibrosis (IPF) requires identification of usual interstitial pneumonia (UIP) pattern. While UIP can be identified by high resolution CT of the chest, the results are often inconclusive, making surgical lung biopsy necessary to reach a definitive diagnosis (Raghu et al., Am J Respir Crit Care Med 183(6):788-824, 2011). The Envisia genomic classifier differentiates UIP from non-UIP pathology in transbronchial biopsies (TBB), potentially allowing patients to avoid an invasive procedure (Brown et al., Am J Respir Crit Care Med 195:A6792, 2017). To ensure patient safety and efficacy, a laboratory developed test (LDT) must meet strict regulatory requirements for accuracy, reproducibility and robustness. The analytical characteristics of the Envisia test are assessed and reported here. METHODS: The Envisia test utilizes total RNA extracted from TBB samples to perform Next Generation RNA Sequencing. The gene count data from 190 genes are then input to the Envisia genomic classifier, a machine learning algorithm, to output either a UIP or non-UIP classification result. We characterized the stability of RNA in TBBs during collection and shipment, and evaluated input RNA mass and proportions on the limit of detection of UIP. We evaluated potentially interfering substances such as blood and genomic DNA. Intra-run, inter-run, and inter-laboratory reproducibility of test results were also characterized. RESULTS: RNA content within TBBs preserved in RNAprotect is stable for up to 14 days with no detectable change in RNA quality. The Envisia test is tolerant to variation in RNA input (5 to 30 ng), with no impact on classifier results. The Envisia test can tolerate dilution of non-UIP and UIP classification signals at the RNA level by up to 60% and 20%, respectively. 
Analytical specificity studies utilizing UIP and non-UIP samples mixed with genomic DNA (up to 30% relative input) demonstrated no impact to classifier results. The Envisia test tolerates up to 22% of blood contamination, well beyond the level observed in TBBs. The test is reproducible from RNA extraction through to Envisia test result (standard deviation of 0.20 for Envisia classification scores on > 7-unit scale). CONCLUSIONS: The Envisia test demonstrates the robust analytical performance required of an LDT. Envisia can be used to inform the diagnoses of patients with suspected IPF.
Subjects
Gene Expression Profiling/methods, Interstitial Lung Diseases/genetics, Interstitial Lung Diseases/pathology, Lung/pathology, RNA Sequence Analysis, Algorithms, Biopsy, Bronchoscopy, Genomics, High-Throughput Nucleotide Sequencing, Humans, Interstitial Lung Diseases/diagnosis, Machine Learning, Reproducibility of Results, Sensitivity and Specificity
ABSTRACT
BACKGROUND: The current standard practice of lung lesion diagnosis often leads to inconclusive results, requiring additional diagnostic follow up procedures that are invasive and often unnecessary due to the high benign rate in such lesions (Chest 143:e78S-e92, 2013). The Percepta bronchial genomic classifier was developed and clinically validated to provide more accurate classification of lung nodules and lesions that are inconclusive by bronchoscopy, using bronchial brushing specimens (N Engl J Med 373:243-51, 2015, BMC Med Genomics 8:18, 2015). The analytical performance of the Percepta test is reported here. METHODS: Analytical performance studies were designed to characterize the stability of RNA in bronchial brushing specimens during collection and shipment; analytical sensitivity defined as input RNA mass; analytical specificity (i.e. potentially interfering substances) as tested on blood and genomic DNA; and assay performance studies including intra-run, inter-run, and inter-laboratory reproducibility. RESULTS: RNA content within bronchial brushing specimens preserved in RNAprotect is stable for up to 20 days at 4 °C with no changes in RNA yield or integrity. Analytical sensitivity studies demonstrated tolerance to variation in RNA input (157 ng to 243 ng). Analytical specificity studies utilizing cancer positive and cancer negative samples mixed with either blood (up to 10 % input mass) or genomic DNA (up to 10 % input mass) demonstrated no assay interference. The test is reproducible from RNA extraction through to Percepta test result, including variation across operators, runs, reagent lots, and laboratories (standard deviation of 0.26 for scores on > 6 unit scale). CONCLUSIONS: Analytical sensitivity, analytical specificity and robustness of the Percepta test were successfully verified, supporting its suitability for clinical use.
Subjects
Bronchi/metabolism, Bronchi/pathology, Genomics, Lung Neoplasms/diagnosis, Lung Neoplasms/genetics, Case-Control Studies, Genomics/methods, Genomics/standards, Humans, Reproducibility of Results, Respiratory Mucosa/metabolism, Respiratory Mucosa/pathology, Sensitivity and Specificity
ABSTRACT
BACKGROUND: Genomic deletions and duplications are important in the pathogenesis of diseases, such as cancer and mental retardation, and have recently been shown to occur frequently in unaffected individuals as polymorphisms. Affymetrix GeneChip whole genome sampling analysis (WGSA) combined with 100 K single nucleotide polymorphism (SNP) genotyping arrays is one of several microarray-based approaches that are now being used to detect such structural genomic changes. The popularity of this technology and its associated open source data format have resulted in the development of an increasing number of software packages for the analysis of copy number changes using these SNP arrays. RESULTS: We evaluated four publicly available software packages for high throughput copy number analysis using synthetic and empirical 100 K SNP array data sets, the latter obtained from 107 mental retardation (MR) patients and their unaffected parents and siblings. We evaluated the software with regards to overall suitability for high-throughput 100 K SNP array data analysis, as well as effectiveness of normalization, scaling with various reference sets and feature extraction, as well as true and false positive rates of genomic copy number variant (CNV) detection. CONCLUSION: We observed considerable variation among the numbers and types of candidate CNVs detected by different analysis approaches, and found that multiple programs were needed to find all real aberrations in our test set. The frequency of false positive deletions was substantial, but could be greatly reduced by using the SNP genotype information to confirm loss of heterozygosity.
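The conclusion notes that false positive deletions can be reduced by using SNP genotypes to confirm loss of heterozygosity. The reasoning is that a one-copy deletion leaves a single allele at each SNP, so heterozygous calls inside a true deletion should be absent apart from genotyping error. The following is a minimal sketch of that filter, not the procedure used in the study: the function name, the genotype encoding ("AA", "AB", "BB") and both thresholds are assumptions made for illustration.

```python
from typing import Sequence


def deletion_supported_by_loh(
    calls: Sequence[str],
    max_het_fraction: float = 0.05,  # assumed tolerance for genotyping error
    min_called: int = 5,             # assumed minimum number of informative SNPs
) -> bool:
    """Return True if genotype calls inside a candidate deletion show no
    meaningful heterozygosity, as expected when only one copy remains."""
    # Ignore no-calls and anything that is not a recognised genotype.
    called = [c for c in calls if c in ("AA", "AB", "BB")]
    if len(called) < min_called:
        return False  # too few SNPs to support the candidate either way
    het = sum(1 for c in called if c == "AB")
    return het / len(called) <= max_het_fraction


# A candidate with only homozygous calls is supported; one with several
# heterozygous calls is treated as a likely false positive.
supported = deletion_supported_by_loh(["AA", "BB", "AA", "AA", "BB", "BB"])
rejected = deletion_supported_by_loh(["AA", "AB", "AB", "BB", "AB", "AA"])
```

A run of homozygous calls can also occur without any deletion, so a check like this can only corroborate a copy number call, not replace it.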
Subjects
Algorithms, Gene Dosage/genetics, Genetic Variation/genetics, Genomics/standards, Oligonucleotide Array Sequence Analysis/standards, Software Validation, Adult, Child, Human Genome/genetics, Genomics/methods, Humans, Oligonucleotide Array Sequence Analysis/methods
ABSTRACT
MOTIVATION: The identification of signatures of positive selection can provide important insights into recent evolutionary history in human populations. Current methods mostly rely on allele frequency determination or focus on one or a small number of candidate chromosomal regions per study. With the availability of large-scale genotype data, efficient approaches for an unbiased whole genome scan are becoming necessary. METHODS: We have developed a new method, the whole genome long-range haplotype test (WGLRH), which uses genome-wide distributions to test for recent positive selection. Adapted from the long-range haplotype (LRH) test, the WGLRH test uses patterns of linkage disequilibrium (LD) to identify regions with extremely low historic recombination. Common haplotypes with significantly longer than expected ranges of LD given their frequencies are identified as putative signatures of recent positive selection. In addition, we have also determined the ancestral alleles of SNPs by genotyping chimpanzee and gorilla DNA, and have identified SNPs where the non-ancestral alleles have risen to extremely high frequencies in human populations, termed 'flipped SNPs'. Combining the haplotype test and the flipped SNPs determination, the WGLRH test serves as an unbiased genome-wide screen for regions under putative selection, and is potentially applicable to the study of other human populations. RESULTS: Using WGLRH and high-density oligonucleotide arrays interrogating 116 204 SNPs, we rapidly identified putative regions of positive selection in three populations (Asian, Caucasian, African-American), and extended these observations to a fourth population, Yoruba, with data obtained from the International HapMap consortium. We mapped significant regions to annotated genes. 
While some regions overlap with genes previously suggested to be under positive selection, many of the genes have not been previously implicated in natural selection and offer intriguing possibilities for further study. AVAILABILITY: the programs for the WGLRH algorithm are freely available and can be downloaded at http://www.affymetrix.com/support/supplement/WGLRH_program.zip.
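The "flipped SNP" idea described in the METHODS can be illustrated with a small sketch: given the ancestral allele (inferred in the study from chimpanzee and gorilla genotypes) and allele counts in a human population, a SNP is flagged when the non-ancestral allele is at very high frequency. The data structure, the function names and the 0.95 cutoff are assumptions for illustration; the abstract does not state the threshold used, and this is not the WGLRH program itself.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SnpCounts:
    snp_id: str
    ancestral_allele: str
    allele_counts: Dict[str, int]  # allele -> number of chromosomes carrying it


def derived_allele_frequency(snp: SnpCounts) -> float:
    """Frequency of all non-ancestral alleles among the sampled chromosomes."""
    total = sum(snp.allele_counts.values())
    if total == 0:
        raise ValueError(f"no allele counts for {snp.snp_id}")
    ancestral = snp.allele_counts.get(snp.ancestral_allele, 0)
    return (total - ancestral) / total


def flipped_snps(snps: List[SnpCounts], min_derived_freq: float = 0.95) -> List[str]:
    """IDs of SNPs whose non-ancestral allele frequency reaches the (assumed) cutoff."""
    return [s.snp_id for s in snps if derived_allele_frequency(s) >= min_derived_freq]


# Made-up example data, not taken from the study.
example = [
    SnpCounts("snp_a", "C", {"C": 4, "T": 96}),   # derived allele at 0.96
    SnpCounts("snp_b", "G", {"G": 70, "A": 30}),  # derived allele at 0.30
]
flagged = flipped_snps(example)
```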
Subjects
Biological Evolution, Chromosome Mapping/methods, Genetic Variation/genetics, Population Genetics, Human Genome/genetics, Genomic Imprinting/genetics, Genetic Selection, Algorithms, Animals, Molecular Evolution, Humans, Single Nucleotide Polymorphism/genetics, DNA Sequence Analysis/methods, Software
ABSTRACT
Genetic studies aimed at understanding the molecular basis of complex human phenotypes require the genotyping of many thousands of single-nucleotide polymorphisms (SNPs) across large numbers of individuals. Public efforts have so far identified over two million common human SNPs; however, the scoring of these SNPs is labor-intensive and requires a substantial amount of automation. Here we describe a simple but effective approach, termed whole-genome sampling analysis (WGSA), for genotyping thousands of SNPs simultaneously in a complex DNA sample without locus-specific primers or automation. Our method amplifies highly reproducible fractions of the genome across multiple DNA samples and calls genotypes at >99% accuracy. We rapidly genotyped 14,548 SNPs in three different human populations and identified a subset of them with significant allele frequency differences between groups. We also determined the ancestral allele for 8,386 SNPs by genotyping chimpanzee and gorilla DNA. WGSA is highly scaleable and enables the creation of ultrahigh density SNP maps for use in genetic studies.
Subjects
Algorithms, DNA/chemistry, DNA/genetics, Gene Expression Profiling/methods, Human Genome, Oligonucleotide Array Sequence Analysis/methods, Single Nucleotide Polymorphism/genetics, DNA Sequence Analysis/methods, Base Sequence, Gene Frequency/genetics, Genotype, Humans, Molecular Sequence Data, Reproducibility of Results, Sensitivity and Specificity, Sequence Alignment/methods, Nucleic Acid Sequence Homology
ABSTRACT
The Collaborative Study on the Genetics of Alcoholism (COGA) is a large-scale family study designed to identify genes that affect the risk for alcoholism and alcohol-related phenotypes. We performed genome-wide linkage analyses on the COGA data made available to participants in the Genetic Analysis Workshop 14 (GAW 14). The dataset comprised 1,350 participants from 143 families. The samples were analyzed on three technologies: microsatellites spaced at 10 cM, Affymetrix GeneChip Human Mapping 10 K Array (HMA10K) and Illumina SNP-based Linkage III Panel. We used ALDX1 and ALDX2, the COGA definitions of alcohol dependence, as well as electrophysiological measures TTTH1 and ECB21 to detect alcoholism susceptibility loci. Many chromosomal regions were found to be significant for each of the phenotypes at a p-value of 0.05. The most significant region for ALDX1 is on chromosome 7, with a maximum LOD score of 2.25 for Affymetrix SNPs, 1.97 for Illumina SNPs, and 1.72 for microsatellites. The same regions on chromosome 7 (96-106 cM) and 10 (149-176 cM) were found to be significant for both ALDX1 and ALDX2. A region on chromosome 7 (112-153 cM) and a region on chromosome 6 (169-185 cM) were identified as the most significant regions for TTTH1 and ECB21, respectively. We also performed linkage analysis on denser maps of markers by combining the SNPs datasets from Affymetrix and Illumina. Adding the microsatellite data to the combined SNP dataset improved the results only marginally. The results indicated that SNPs outperform microsatellites with the densest marker sets performing the best.
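To relate the LOD scores quoted above to the stated p-value threshold of 0.05, a common large-sample approximation treats 2·ln(10)·LOD as a chi-square statistic with one degree of freedom and halves the tail probability, because linkage is tested in one direction only. The sketch below applies that approximation; it is an assumption for illustration and may not match how significance was actually assessed in the study. The function name is hypothetical.

```python
import math


def lod_to_pointwise_p(lod: float) -> float:
    """Approximate pointwise p-value for a positive LOD score.

    Uses chi2 = 2 * ln(10) * LOD with 1 degree of freedom; for 1 df the
    upper tail is erfc(sqrt(chi2 / 2)), halved for the one-sided test.
    """
    if lod <= 0:
        raise ValueError("approximation is intended for positive LOD scores")
    chi2 = 2.0 * math.log(10.0) * lod
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


# Maximum LOD scores reported for ALDX1 on chromosome 7, by marker set.
reported = {"Affymetrix SNPs": 2.25, "Illumina SNPs": 1.97, "microsatellites": 1.72}
p_values = {name: lod_to_pointwise_p(lod) for name, lod in reported.items()}
for name, p in p_values.items():
    print(f"{name}: LOD {reported[name]:.2f} -> approximate pointwise p = {p:.2g}")
```

Under this approximation a pointwise p-value of 0.05 corresponds to a LOD score of roughly 0.59, so all three reported maxima are well beyond that nominal threshold.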
Subjects
Alcoholism/genetics, Alcoholism/physiopathology, Chromosome Mapping, Electroencephalography, Genome-Wide Association Study, Microsatellite Repeats/genetics, Single Nucleotide Polymorphism/genetics, Human Chromosome Pair 7/genetics, Humans, Phenotype
ABSTRACT
The data provided to the Genetic Analysis Workshop 14 (GAW 14) was the result of a collaboration among several different groups, catalyzed by Elizabeth Pugh from The Center for Inherited Disease Research (CIDR) and the organizers of GAW 14, Jean MacCluer and Laura Almasy. The DNA, phenotypic characterization, and microsatellite genomic survey were provided by the Collaborative Study on the Genetics of Alcoholism (COGA), a nine-site national collaboration funded by the National Institute on Alcohol Abuse and Alcoholism (NIAAA) and the National Institute on Drug Abuse (NIDA) with the overarching goal of identifying and characterizing genes that affect the susceptibility to develop alcohol dependence and related phenotypes. CIDR, Affymetrix, and Illumina provided single-nucleotide polymorphism genotyping of a large subset of the COGA subjects. This article briefly describes the dataset that was provided.
Subjects
Alcoholism/genetics, Congresses as Topic, Cooperative Behavior, Genetic Databases, Single Nucleotide Polymorphism/genetics, Genotype, Humans, Oligonucleotide Array Sequence Analysis, Quality Control
ABSTRACT
We report a method, Expression-Microarray Copy Number Analysis (ECNA) for the detection of copy number changes using Affymetrix Human Genome U133 Plus 2.0 arrays, starting with as little as 5 ng input genomic DNA. An analytical approach was developed using DNA isolated from cell lines containing various X-chromosome numbers, and validated with DNA from cell lines with defined deletions and amplifications in other chromosomal locations. We applied this method to examine the copy number changes in DNA from 5 frozen gastrointestinal stromal tumors (GIST). We detected known copy number aberrations consistent with previously published results using conventional or BAC-array CGH, as well as novel changes in GIST tumors. These changes were concordant with results from Affymetrix 100K human SNP mapping arrays. Gene expression data for these GIST samples had previously been generated on U133A arrays, allowing us to explore correlations between chromosomal copy number and RNA expression levels. One of the novel aberrations identified in the GIST samples, a previously unreported gain on 1q21.1 containing the PEX11B gene, was confirmed in this study by FISH and was also shown to have significant differences in expression pattern when compared to a control sample. In summary, we have demonstrated the use of gene expression microarrays for the detection of genomic copy number aberrations in tumor samples. This method may be used to study copy number changes in other species for which RNA expression arrays are available, e.g. other mammals, plants, etc., and for which SNPs have not yet been mapped.
ABSTRACT
Composite organic-inorganic nanoparticles (COINs) are novel optical labels for detection of biomolecules. We have previously developed methods to encapsulate COINs and to functionalize them with antibodies. Here we report the first steps toward application of COINs to the detection of proteins in human tissues. Two analytes, PSA and CK18, are detected simultaneously using two different COINs in a direct binding assay, and two different COINs are shown to simultaneously label PSA in tissue samples.
Subjects
Nanoparticles/chemistry, Antibodies, Enzyme-Linked Immunosorbent Assay, Histocytochemistry/methods, Humans, Keratin-18/analysis, Male, Nanotechnology/methods, Prostate/chemistry, Prostate-Specific Antigen/analysis, Protein Binding, Raman Spectrum Analysis
ABSTRACT
The cause of mental retardation in one-third to one-half of all affected individuals is unknown. Microscopically detectable chromosomal abnormalities are the most frequently recognized cause, but gain or loss of chromosomal segments that are too small to be seen by conventional cytogenetic analysis has been found to be another important cause. Array-based methods offer a practical means of performing a high-resolution survey of the entire genome for submicroscopic copy-number variants. We studied 100 children with idiopathic mental retardation and normal results of standard chromosomal analysis, by use of whole-genome sampling analysis with Affymetrix GeneChip Human Mapping 100K arrays. We found de novo deletions as small as 178 kb in eight cases, de novo duplications as small as 1.1 Mb in two cases, and unsuspected mosaic trisomy 9 in another case. This technology can detect at least twice as many potentially pathogenic de novo copy-number variants as conventional cytogenetic analysis can in people with mental retardation.
Subjects
Chromosome Aberrations, Intellectual Disability/diagnosis, Oligonucleotide Array Sequence Analysis, Child, Gene Dosage, Human Genome, Humans, Sequence Deletion
ABSTRACT
Mutation of the human genome ranges from single base-pair changes to whole-chromosome aneuploidy. Karyotyping, fluorescence in situ hybridization, and comparative genome hybridization are currently used to detect chromosome abnormalities of clinical significance. These methods, although powerful, suffer from limitations in speed, ease of use, and resolution, and they do not detect copy-neutral chromosomal aberrations--for example, uniparental disomy (UPD). We have developed a high-throughput approach for assessment of DNA copy-number changes, through use of high-density synthetic oligonucleotide arrays containing 116,204 single-nucleotide polymorphisms, spaced at an average distance of 23.6 kb across the genome. Using this approach, we analyzed samples that failed conventional karyotypic analysis, and we detected amplifications and deletions across a wide range of sizes (1.3-145.9 Mb), identified chromosomes containing anonymous chromatin, and used genotype data to determine the molecular origin of two cases of UPD. Furthermore, our data provided independent confirmation for a case that had been misinterpreted by karyotype analysis. The high resolution of our approach provides more-precise breakpoint mapping, which allows subtle phenotypic heterogeneity to be distinguished at a molecular level. The accurate genotype information provided on these arrays enables the identification of copy-neutral loss-of-heterozygosity events, and the minimal requirement of DNA (250 ng per array) allows rapid analysis of samples without the need for cell culture. This technology overcomes many limitations currently encountered in routine clinical diagnostic laboratories tasked with accurate and rapid diagnosis of chromosomal abnormalities.
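As a quick consistency check on the array description, the number of SNPs multiplied by the average spacing should come out close to the size of the human genome. The sketch below uses only the two figures quoted in the abstract; the variable names are illustrative.

```python
# Figures quoted in the abstract.
snp_count = 116_204
mean_spacing_kb = 23.6

# Approximate span covered if SNPs were laid end to end at the mean spacing.
span_gb = snp_count * mean_spacing_kb / 1_000_000  # kb -> Gb
print(f"Approximate span covered: {span_gb:.2f} Gb")
```

The product is about 2.74 Gb, which is in line with an array that samples most of the roughly 3 Gb human genome.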
Subjects
Chromosome Aberrations, Chromosome Mapping, Human Genome, Oligonucleotide Array Sequence Analysis, Single Nucleotide Polymorphism/genetics, Human Chromosomes, DNA/analysis, Humans
ABSTRACT
Despite the theoretical evidence of the utility of single-nucleotide polymorphisms (SNPs) for linkage analysis, no whole-genome scans of a complex disease have yet been published to directly compare SNPs with microsatellites. Here, we describe a whole-genome screen of 157 families with multiple cases of rheumatoid arthritis (RA), performed using 11,245 genomewide SNPs. The results were compared with those from a 10-cM microsatellite scan in the same cohort. The SNP analysis detected HLA*DRB1, the major RA susceptibility locus (P=.00004), with a linkage interval of 31 cM, compared with a 50-cM linkage interval detected by the microsatellite scan. In addition, four loci were detected at a nominal significance level (P<.05) in the SNP linkage analysis; these were not observed in the microsatellite scan. We demonstrate that variation in information content was the main factor contributing to observed differences in the two scans, with the SNPs providing significantly higher information content than the microsatellites. Reducing the number of SNPs in the marker set to 3,300 (1-cM spacing) caused several loci to drop below nominal significance levels, suggesting that decreases in information content can have significant effects on linkage results. In contrast, differences in maps employed in the analysis, the low detectable rate of genotyping error, and the presence of moderate linkage disequilibrium between markers did not significantly affect the results. We have demonstrated the utility of a dense SNP map for performing linkage analysis in a late-age-at-onset disease, where DNA from parents is not always available. The high SNP density allows loci to be defined more precisely and provides a partial scaffold for association studies, substantially reducing the resource requirement for gene-mapping studies.