ABSTRACT
[This corrects the article DOI: 10.1371/journal.pgen.1000832.].
ABSTRACT
BACKGROUND: Despite recent advances in the investigation of myeloproliferative neoplasms (MPN), the impact of genetic heterogeneity on their molecular pathogenesis has not been fully elucidated. Thus, in this study, we aim to characterize the genetic complexity in Korean patients with polycythemia vera (PV) and essential thrombocythemia (ET). METHODS: We conducted association studies using 84 single-nucleotide polymorphisms (SNPs) in 229 patients (96 with PV and 133 with ET) and 170 controls. Further, whole-genome sequencing was performed in six patients (two with JAK2 V617F and four with wild-type JAK2), and putative somatic mutations were validated in a further 69 ET patients. Clinical and laboratory characteristics were also analyzed. RESULTS: Several germline SNPs and the 46 haplotype were significantly associated with PV and ET. Three somatic mutations in the MPDZ, IQCH, and CALR genes were selected and validated. The frequency of the CALR mutation was 58.0% (40/69) in ET patients who did not carry JAK2/MPL mutations. Moreover, compared with JAK2 V617F-positive patients, those with CALR mutations showed lower hemoglobin and hematocrit levels (P = 0.004 and P = 0.002, respectively), higher platelet counts (P = 0.008), and a lower frequency of cytoreductive therapy (P = 0.014). CONCLUSION: This study was the first comprehensive investigation of the genetic characteristics of Korean patients with PV and ET. We found that somatic mutations and the 46 haplotype contribute to PV and ET pathogenesis in Korean patients.
Subjects
Genetic Predisposition to Disease/genetics, Janus Kinase 2/genetics, Polycythemia Vera/genetics, Single Nucleotide Polymorphism/genetics, Thrombopoietin Receptors/genetics, Essential Thrombocythemia/genetics, Adult, Aged, Aged 80 and over, Carrier Proteins/genetics, DNA Mutational Analysis, Female, Gene Frequency, Genetic Association Studies, Genotype, Humans, Male, Membrane Proteins, Middle Aged, Polycythemia Vera/epidemiology, Republic of Korea/epidemiology, Nonparametric Statistics, Essential Thrombocythemia/epidemiology, Young Adult
ABSTRACT
The study of reaction-diffusion processes is much more complicated on general curved surfaces than on standard Cartesian coordinate spaces. Here we show how to formulate and solve systems of reaction-diffusion equations on surfaces in an extremely simple way, using only the standard Cartesian form of differential operators, and a discrete unorganized point set to represent the surface. Our method decouples surface geometry from the underlying differential operators. As a consequence, it becomes possible to formulate and solve rather general reaction-diffusion equations on general surfaces without having to consider the complexities of differential geometry or sophisticated numerical analysis. To illustrate the generality of the method, computations for surface diffusion, pattern formation, excitable media, and bulk-surface coupling are provided for a variety of complex point cloud surfaces.
Subjects
Algorithms, Chemical Phenomena, Mathematics/methods, Theoretical Models, Diffusion
ABSTRACT
Cytosine DNA methylation is important in regulating gene expression and in silencing transposons and other repetitive sequences. Recent genomic studies in Arabidopsis thaliana have revealed that many endogenous genes are methylated either within their promoters or within their transcribed regions, and that gene methylation is highly correlated with transcription levels. However, plants have different types of methylation controlled by different genetic pathways, and detailed information on the methylation status of each cytosine in any given genome is lacking. To this end, we generated a map at single-base-pair resolution of methylated cytosines for Arabidopsis, by combining bisulphite treatment of genomic DNA with ultra-high-throughput sequencing using the Illumina 1G Genome Analyser and Solexa sequencing technology. This approach, termed BS-Seq, unlike previous microarray-based methods, allows one to sensitively measure cytosine methylation on a genome-wide scale within specific sequence contexts. Here we describe methylation on previously inaccessible components of the genome and analyse the DNA methylation sequence composition and distribution. We also describe the effect of various DNA methylation mutants on genome-wide methylation patterns, and demonstrate that our newly developed library construction and computational methods can be applied to large genomes such as that of mouse.
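The principle behind bisulphite sequencing can be illustrated with a small sketch. It relies on the well-known chemistry that bisulphite converts unmethylated cytosine to uracil (read as T after amplification) while methylated cytosine is protected, and on the usual CG, CHG, and CHH sequence contexts (H = A, C, or T). The function names, the toy reference, and the forward-strand-only treatment are assumptions made for illustration; this is not the authors' library construction or computational pipeline.

```python
def bisulfite_convert(seq, methylated):
    """Simulate the read expected after bisulphite treatment and PCR.

    Unmethylated C is read as T; C at an index in `methylated` stays C.
    Only the forward strand is modelled (an illustrative simplification).
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def context(seq, i):
    """Classify the cytosine at index i as CG, CHG, or CHH (H = A, C, or T)."""
    nxt = seq[i + 1:i + 3]
    if len(nxt) >= 1 and nxt[0] == "G":
        return "CG"
    if len(nxt) == 2 and nxt[1] == "G":
        return "CHG"
    if len(nxt) == 2:
        return "CHH"
    return "unknown"  # too close to the end of the sequence to classify


def call_methylation(reference, read):
    """Compare a converted read with the reference at every reference C."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            status = "methylated"
        elif read_base == "T":
            status = "unmethylated"
        else:
            status = "ambiguous"  # sequencing error or true variant
        calls[i] = (context(reference, i), status)
    return calls


# Hypothetical toy data: cytosines at indices 1 and 4 are methylated.
reference = "ACGTCAGCTTC"
read = bisulfite_convert(reference, methylated={1, 4})
calls = call_methylation(reference, read)
for position, (ctx, status) in sorted(calls.items()):
    print(position, ctx, status)
```

A real analysis would also have to handle the reverse strand, incomplete conversion, and alignment of converted reads, none of which this sketch attempts.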
Subjects
Arabidopsis/genetics, DNA Methylation, Plant Genome/genetics, DNA Sequence Analysis/methods, Sulfites/metabolism, 5-Methylcytosine/metabolism, Animals, Base Sequence, Computational Biology, Cytosine/metabolism, Plant Gene Expression Regulation/genetics, Gene Library, Mice, Mutation/genetics, Genetic Promoter Regions/genetics, Reproducibility of Results, Uracil/metabolism
ABSTRACT
We delineated a syndromic recessive preaxial brachydactyly with partial duplication of proximal phalanges to 16.8 Mb over 4 chromosomes. High-throughput sequencing of all 177 candidate genes detected a truncating frameshift mutation in the gene CHSY1 encoding a chondroitin synthase with a Fringe domain. CHSY1 was secreted from patients' fibroblasts and was required for synthesis of chondroitin sulfate moieties. Noticeably, its absence triggered massive production of JAG1 and subsequent NOTCH activation, which could only be reversed with a wild-type but not a Fringe catalytically dead CHSY1 construct. In vitro, depletion of CHSY1 by RNAi knockdown resulted in enhanced osteogenesis in fetal osteoblasts and remarkable upregulation of JAG2 in glioblastoma cells. In vivo, chsy1 knockdown in zebrafish embryos partially phenocopied the human disorder; it increased NOTCH output and impaired skeletal, pectoral-fin, and retinal development. We conclude that CHSY1 is a secreted FRINGE enzyme required for adjustment of NOTCH signaling throughout human and fish embryogenesis and particularly during limb patterning.
Subjects
Congenital Foot Deformities/genetics, Congenital Hand Deformities/genetics, N-Acetylgalactosaminyltransferases/genetics, Notch Receptors/metabolism, Signal Transduction, Amino Acid Sequence, Cultured Cells, Female, Frameshift Mutation, Genotype, Humans, Male, Molecular Sequence Data, N-Acetylgalactosaminyltransferases/chemistry, Pedigree, Polymerase Chain Reaction, RNA Interference, Amino Acid Sequence Homology, Syndrome
ABSTRACT
BACKGROUND: Establishing the genetic basis of phenotypes such as skeletal dysplasia in model organisms can provide insights into biologic processes and their role in human disease. METHODS: We screened mutagenized mice and observed a neonatal lethal skeletal dysplasia with an autosomal recessive pattern of inheritance. Through genetic mapping and positional cloning, we identified the causative mutation. RESULTS: Affected mice had a nonsense mutation in the thyroid hormone receptor interactor 11 gene (Trip11), which encodes the Golgi microtubule-associated protein 210 (GMAP-210); the affected mice lacked this protein. Golgi architecture was disturbed in multiple tissues, including cartilage. Skeletal development was severely impaired, with chondrocytes showing swelling and stress in the endoplasmic reticulum, abnormal cellular differentiation, and increased cell death. Golgi-mediated glycosylation events were altered in fibroblasts and chondrocytes lacking GMAP-210, and these chondrocytes had intracellular accumulation of perlecan, an extracellular matrix protein, but not of type II collagen or aggrecan, two other extracellular matrix proteins. The similarities between the skeletal and cellular phenotypes in these mice and those in patients with achondrogenesis type 1A, a neonatal lethal form of skeletal dysplasia in humans, suggested that achondrogenesis type 1A may be caused by GMAP-210 deficiency. Sequence analysis revealed loss-of-function mutations in the 10 unrelated patients with achondrogenesis type 1A whom we studied. CONCLUSIONS: GMAP-210 is required for the efficient glycosylation and cellular transport of multiple proteins. The identification of a mutation affecting GMAP-210 in mice, and then in humans, as the cause of a lethal skeletal dysplasia underscores the value of screening for abnormal phenotypes in model organisms and identifying the causative mutations.
Subjects
Chondrocytes/cytology, Nonsense Codon, Nuclear Proteins/genetics, Osteochondrodysplasias/genetics, Animals, Cell Differentiation, Cell Proliferation, Cytoskeletal Proteins, Endoplasmic Reticulum/ultrastructure, Recessive Genes, Glycosylation, Golgi Apparatus/ultrastructure, Humans, Mice, Mutant Mice, Nuclear Proteins/deficiency, Phenotype, Single Nucleotide Polymorphism, Post-Translational Protein Processing/physiology, DNA Sequence Analysis
ABSTRACT
OBJECTIVE: Myoclonus is characterized by sudden, brief involuntary movements, and its presence is debilitating. We identified a family suffering from adult onset, cortical myoclonus without associated seizures. We performed clinical, electrophysiological, and genetic studies to define this phenotype. METHODS: A large, 4-generation family with a history of myoclonus underwent careful questioning, examination, and electrophysiological testing. Thirty-five family members donated blood samples for genetic analysis, which included single nucleotide polymorphism mapping, microsatellite linkage, targeted massively parallel sequencing, and Sanger sequencing. In silico and in vitro experiments were performed to investigate functional significance of the mutation. RESULTS: We identified 11 members of a Canadian Mennonite family suffering from adult onset, slowly progressive, disabling, multifocal myoclonus. Somatosensory evoked potentials indicated a cortical origin of the myoclonus. There were no associated seizures. Some severely affected individuals developed signs of progressive cerebellar ataxia of variable severity late in the course of their illness. The phenotype was inherited in an autosomal dominant fashion. We demonstrated linkage to chromosome 16q21-22.1. We then sequenced all coding sequence in the critical region, identifying only a single cosegregating, novel, nonsynonymous mutation, which resides in the gene NOL3. Furthermore, this mutation was found to alter post-translational modification of NOL3 protein in vitro. INTERPRETATION: We propose that familial cortical myoclonus is a novel movement disorder that may be caused by mutation in NOL3. Further investigation of the role of NOL3 in neuronal physiology may shed light on neuronal membrane hyperexcitability and pathophysiology of myoclonus and related disorders.
Subjects
Apoptosis Regulatory Proteins/genetics, Family Health, Genetic Predisposition to Disease/genetics, Muscle Proteins/genetics, Mutation/genetics, Myoclonus/genetics, Adolescent, Adult, Age of Onset, Animals, Canada, Transformed Cell Line, Chromosome Mapping, Human Chromosomes Pair 16, Electroencephalography, Female, Glutamic Acid/genetics, Humans, Male, Mice, Middle Aged, Myoclonus/diagnosis, Phenotype, Proline/genetics, Transfection
ABSTRACT
U87MG is a commonly studied grade IV glioma cell line that has been analyzed in at least 1,700 publications over four decades. In order to comprehensively characterize the genome of this cell line and to serve as a model of broad cancer genome sequencing, we have generated greater than 30x genomic sequence coverage using a novel 50-base mate paired strategy with a 1.4kb mean insert library. A total of 1,014,984,286 mate-end and 120,691,623 single-end two-base encoded reads were generated from five slides. All data were aligned using a custom designed tool called BFAST, allowing optimal color space read alignment and accurate identification of DNA variants. The aligned sequence reads and mate-pair information identified 35 interchromosomal translocation events, 1,315 structural variations (>100 bp), 191,743 small (<21 bp) insertions and deletions (indels), and 2,384,470 single nucleotide variations (SNVs). Among these observations, the known homozygous mutation in PTEN was robustly identified, and genes involved in cell adhesion were overrepresented in the mutated gene list. Data were compared to 219,187 heterozygous single nucleotide polymorphisms assayed by Illumina 1M Duo genotyping array to assess accuracy: 93.83% of all SNPs were reliably detected at filtering thresholds that yield greater than 99.99% sequence accuracy. Protein coding sequences were disrupted predominantly in this cancer cell line due to small indels, large deletions, and translocations. In total, 512 genes were homozygously mutated, including 154 by SNVs, 178 by small indels, 145 by large microdeletions, and 35 by interchromosomal translocations to reveal a highly mutated cell line genome. Of the small homozygously mutated variants, 8 SNVs and 99 indels were novel events not present in dbSNP. These data demonstrate that routine generation of broad cancer genome sequence is possible outside of genome centers. 
The sequence analysis of U87MG provides an unparalleled level of mutational resolution compared to any cell line to date.
Subjects
Tumor Cell Line/chemistry, Human Genome, Glioma/genetics, Tumor Cell Line/cytology, Genotype, Humans, Molecular Sequence Data, Mutation, Single Nucleotide Polymorphism, Proteins/genetics, DNA Sequence Analysis
ABSTRACT
Analysis of a nuclear family with three affected offspring identified an autosomal-recessive form of spondyloepimetaphyseal dysplasia characterized by severe short stature and a unique constellation of radiographic findings. Homozygosity for a haplotype that was identical by descent between two of the affected individuals identified a locus for the disease gene within a 17.4 Mb interval on chromosome 15, a region containing 296 genes. These genes were assessed and ranked by cartilage selectivity with whole-genome microarray data, revealing only two genes, encoding aggrecan and chondroitin sulfate proteoglycan 4, that were selectively expressed in cartilage. Sequence analysis of aggrecan complementary DNA from an affected individual revealed homozygosity for a missense mutation (c.6799G --> A) that predicts a p.D2267N amino acid substitution in the C-type lectin domain within the G3 domain of aggrecan. The D2267 residue is predicted to coordinate binding of a calcium ion, which influences the conformational binding loops of the C-type lectin domain that mediate interactions with tenascins and other extracellular-matrix proteins. Expression of the normal and mutant G3 domains in mammalian cells showed that the mutation created a functional N-glycosylation site but did not adversely affect protein trafficking and secretion. Surface-plasmon-resonance studies showed that the mutation influenced the binding and kinetics of the interactions between the aggrecan G3 domain and tenascin-C. These findings identify an autosomal-recessive skeletal dysplasia and a significant role for the aggrecan C-type lectin domain in regulating endochondral ossification and, thereby, height.
Subjects
Aggrecans/genetics, Antigens/genetics, Genetic Predisposition to Disease, C-Type Lectins/genetics, Missense Mutation, Osteochondrodysplasias/genetics, Proteoglycans/genetics, Adolescent, Adult, Aggrecans/metabolism, Amino Acid Sequence, Antigens/metabolism, Cartilage/metabolism, Cell Line, Child, Female, Humans, C-Type Lectins/metabolism, Male, Molecular Sequence Data, Osteochondrodysplasias/metabolism, Pedigree, Protein Binding, Tertiary Protein Structure, Proteoglycans/metabolism, Tenascin/metabolism, Young Adult
ABSTRACT
The short-rib polydactyly (SRP) syndromes are a heterogeneous group of perinatal lethal skeletal disorders with polydactyly and multisystem organ abnormalities. Homozygosity by descent mapping in a consanguineous SRP family identified a genomic region that contained DYNC2H1, a cytoplasmic dynein involved in retrograde transport in the cilium. Affected individuals in the family were homozygous for an exon 12 missense mutation that predicted the amino acid substitution R587C. Compound heterozygosity for one missense and one null mutation was identified in two additional nonconsanguineous SRP families. Cultured chondrocytes from affected individuals showed morphologically abnormal, shortened cilia. In addition, the chondrocytes showed abnormal cytoskeletal microtubule architecture, implicating an altered microtubule network as part of the disease process. These findings establish SRP as a cilia disorder and demonstrate that DYNC2H1 is essential for skeletogenesis and growth.
Subjects
Cilia/pathology, Dyneins/genetics, Mutation, Short Rib-Polydactyly Syndrome/genetics, Base Sequence, Cultured Cells, Chondrocytes/pathology, Nonsense Codon, Consanguinity, Cytoplasmic Dyneins, DNA Primers/genetics, Dyneins/physiology, Female, Homozygote, Humans, Newborn Infant, Male, Missense Mutation, Pedigree, Pregnancy, Radiography, Short Rib-Polydactyly Syndrome/diagnostic imaging, Short Rib-Polydactyly Syndrome/embryology
ABSTRACT
BACKGROUND: The optimal time for the initiation of antiretroviral therapy for asymptomatic patients with human immunodeficiency virus (HIV) infection is uncertain. METHODS: We conducted two parallel analyses involving a total of 17,517 asymptomatic patients with HIV infection in the United States and Canada who received medical care during the period from 1996 through 2005. None of the patients had undergone previous antiretroviral therapy. In each group, we stratified the patients according to the CD4+ count (351 to 500 cells per cubic millimeter or >500 cells per cubic millimeter) at the initiation of antiretroviral therapy. In each group, we compared the relative risk of death for patients who initiated therapy when the CD4+ count was above each of the two thresholds of interest (early-therapy group) with that of patients who deferred therapy until the CD4+ count fell below these thresholds (deferred-therapy group). RESULTS: In the first analysis, which involved 8362 patients, 2084 (25%) initiated therapy at a CD4+ count of 351 to 500 cells per cubic millimeter, and 6278 (75%) deferred therapy. After adjustment for calendar year, cohort of patients, and demographic and clinical characteristics, among patients in the deferred-therapy group there was an increase in the risk of death of 69%, as compared with that in the early-therapy group (relative risk in the deferred-therapy group, 1.69; 95% confidence interval [CI], 1.26 to 2.26; P<0.001). In the second analysis involving 9155 patients, 2220 (24%) initiated therapy at a CD4+ count of more than 500 cells per cubic millimeter and 6935 (76%) deferred therapy. Among patients in the deferred-therapy group, there was an increase in the risk of death of 94% (relative risk, 1.94; 95% CI, 1.37 to 2.79; P<0.001). CONCLUSIONS: The early initiation of antiretroviral therapy before the CD4+ count fell below two prespecified thresholds significantly improved survival, as compared with deferred therapy.
Subjects
Anti-Retroviral Agents/administration & dosage, CD4 Lymphocyte Count, HIV Infections/drug therapy, Adult, Epidemiologic Confounding Factors, Drug Administration Schedule, Female, HIV/genetics, HIV/immunology, HIV/isolation & purification, HIV Infections/immunology, HIV Infections/mortality, Humans, Male, Middle Aged, Proportional Hazards Models, Viral RNA/analysis, Risk, Survival Analysis
ABSTRACT
In order for next-generation sequencing to become widely used as a diagnostic in the healthcare industry, sequencing instrumentation will need to be mass produced with a high degree of quality and economy. One way to achieve this is to recast DNA sequencing in a format that fully leverages the manufacturing base created for computer chips, complementary metal-oxide semiconductor chip fabrication, which is the current pinnacle of large-scale, high-quality, low-cost manufacturing of high technology. To achieve this, ideally the entire sensory apparatus of the sequencer would be embodied in a standard semiconductor chip, manufactured in the same fab facilities used for logic and memory chips. Recently, such a sequencing chip, and the associated sequencing platform, has been developed and commercialized by Ion Torrent, a division of Life Technologies, Inc. Here we provide an overview of this semiconductor chip-based sequencing technology, and summarize the progress made since its commercial introduction. We describe in detail the progress in chip scaling, sequencing throughput, read length, and accuracy. We also summarize the enhancements in the associated platform, including sample preparation, data processing, and engagement of the broader development community through open source and crowdsourcing initiatives.
Subjects
Semiconductors, DNA Sequence Analysis/methods, DNA/analysis, DNA/chemistry, Electrochemical Techniques/instrumentation, Electrochemical Techniques/methods, Equipment Design, Humans, DNA Sequence Analysis/instrumentation
ABSTRACT
This work reports the first CMOS molecular electronics chip. It is configured as a biosensor, where the primary sensing element is a single molecule "molecular wire" consisting of a ~100 GΩ, 25 nm long alpha-helical peptide integrated into a current monitoring circuit. The engineered peptide contains a central conjugation site for attachment of various probe molecules, such as DNA, proteins, enzymes, or antibodies, which program the biosensor to detect interactions with a specific target molecule. The current through the molecular wire under a dc applied voltage is monitored with millisecond temporal resolution. The detected signals are millisecond-scale, picoampere current pulses generated by each transient probe-target molecular interaction. Implemented in a 0.18 µm CMOS technology, 16k sensors are arrayed with a 20 µm pitch and read out at a 1 kHz frame rate. The resulting biosensor chip provides direct, real-time observation of the single-molecule interaction kinetics, unlike classical biosensors that measure ensemble averages of such events. This molecular electronics chip provides a platform for putting molecular biosensing "on-chip" to bring the power of semiconductor chips to diverse applications in biological research, diagnostics, sequencing, proteomics, drug discovery, and environmental monitoring.
Subjects
Biosensing Techniques, Electronics, Oligonucleotide Array Sequence Analysis, Semiconductors, DNA/chemistry, Nanotechnology, Biosensing Techniques/methods
ABSTRACT
Oral-facial-digital (OFD) syndromes are a heterogeneous group of congenital disorders characterized by malformations of the face and oral cavity, and digit anomalies. Mutations within 12 cilia-related genes have been identified that cause several types of OFD, suggesting that OFDs constitute a subgroup of developmental ciliopathies. Through homozygosity mapping and exome sequencing of two families with variable OFD type 2, we identified distinct germline variants in INTS13, a subunit of the Integrator complex. This multiprotein complex associates with RNA Polymerase II and cleaves nascent RNA to modulate gene expression. We determined that INTS13 utilizes its C-terminus to bind the Integrator cleavage module, which is disrupted by the identified germline variants p.S652L and p.K668Nfs*9. Depletion of INTS13 disrupts ciliogenesis in human cultured cells and causes dysregulation of a broad collection of ciliary genes. Accordingly, its knockdown in Xenopus embryos leads to motile cilia anomalies. Altogether, we show that mutations in INTS13 cause an autosomal recessive ciliopathy, which reveals key interactions between components of the Integrator complex.
Subjects
Carrier Proteins/genetics, Cell Cycle Proteins/genetics, Ciliopathies, Orofaciodigital Syndromes, Cilia/genetics, Ciliopathies/genetics, Homozygote, Humans, Mutation, Orofaciodigital Syndromes/genetics, RNA, RNA Polymerase II/genetics
ABSTRACT
BACKGROUND: DNA sequence comparison is a well-studied problem, in which two DNA sequences are compared using a weighted edit distance. Recent DNA sequencing technologies, however, observe an encoded form of the sequence, rather than each DNA base individually. The encoded DNA sequence may contain technical errors, and therefore encoded sequencing errors must be incorporated when comparing an encoded DNA sequence to a reference DNA sequence. RESULTS: Although two-base encoding is currently used in practice, many other encoding schemes are possible, whereby two or more bases are encoded at a time. A generalized k-base encoding scheme is presented, whereby feasible higher order encodings are better able to differentiate errors in the encoded sequence from true DNA sequence variants. A generalized version of the previous two-base encoding DNA sequence comparison algorithm is used to compare a k-base encoded sequence to a DNA reference sequence. Finally, simulations are performed to evaluate the power, the false positive and false negative SNP discovery rates, and the performance time of k-base encoding compared to previous methods as well as to the standard DNA sequence comparison algorithm. CONCLUSIONS: The novel generalized k-base encoding scheme and resulting local alignment algorithm permit the development of higher fidelity ligation-based next generation sequencing technology. This bioinformatic solution affords greater robustness to errors, as well as lower false SNP discovery rates, only at the cost of computational time.
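Why an encoded read helps separate measurement errors from true variants can be seen in a small sketch of two-base ("color space") encoding. The abstract does not spell out the code, so the mapping below (A=0, C=1, G=2, T=3, with each color being the XOR of two adjacent base codes) is an assumption: it is the color code commonly described for ligation-based sequencing. The function names and toy sequences are hypothetical, and the sketch covers only the two-base case, not the generalized k-base scheme.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def encode(seq):
    """Return the first base and one color per adjacent base pair."""
    colors = [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors


def decode(first_base, colors):
    """Rebuild the base sequence by chaining colors from the first base."""
    bases = [first_base]
    for color in colors:
        bases.append(BASE[CODE[bases[-1]] ^ color])
    return "".join(bases)


def diff_positions(a, b):
    """Indices at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


reference = "ACGGTCA"
first, colors = encode(reference)

# A true single-base variant (G -> A at index 3) changes two adjacent colors.
_, snp_colors = encode("ACGATCA")
snp_color_changes = diff_positions(colors, snp_colors)

# A single color measurement error changes one color, but naive decoding
# propagates it to every downstream base.
error_colors = list(colors)
error_colors[2] ^= 2
decoded_error = decode(first, error_colors)
error_base_changes = diff_positions(reference, decoded_error)

print(colors, snp_color_changes, decoded_error, error_base_changes)
```

An isolated color mismatch is therefore more plausibly an error than a variant, which is why the comparison has to be done jointly with decoding rather than after it.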
Subjects
Algorithms, Sequence Alignment/methods, DNA Sequence Analysis/methods, Computer Simulation, DNA/genetics
ABSTRACT
BACKGROUND: Since the introduction of next-generation DNA sequencers, the rapid increase in sequencer throughput, and associated drop in costs, has resulted in more than a dozen human genomes being resequenced over the last few years. These efforts are merely a prelude to a future in which genome resequencing will be commonplace for both biomedical research and clinical applications. The dramatic increase in sequencer output strains all facets of computational infrastructure, especially databases and query interfaces. The advent of cloud computing, and a variety of powerful tools designed to process petascale datasets, provide a compelling solution to these ever increasing demands. RESULTS: In this work, we present the SeqWare Query Engine, which has been created using modern cloud computing technologies and designed to support databasing information from thousands of genomes. Our backend implementation was built using the highly scalable, NoSQL HBase database from the Hadoop project. We also created a web-based frontend that provides both a programmatic and interactive query interface and integrates with widely used genome browsers and tools. Using the query engine, users can load and query variants (SNVs, indels, translocations, etc.) with a rich level of annotations including coverage and functional consequences. As a proof of concept we loaded several whole genome datasets including the U87MG cell line. We also used a glioblastoma multiforme tumor/normal pair to both profile performance and provide an example of using the Hadoop MapReduce framework within the query engine. This software is open source and freely available from the SeqWare project (http://seqware.sourceforge.net). CONCLUSIONS: The SeqWare Query Engine provided an easy way to make the U87MG genome accessible to programmers and non-programmers alike.
This enabled a faster and more open exploration of results, quicker tuning of parameters for heuristic variant calling filters, and a common data interface to simplify development of analytical tools. The range of data types supported, the ease of querying and integrating with existing tools, and the robust scalability of the underlying cloud-based technologies make SeqWare Query Engine a natural fit for storing and searching ever-growing genome sequence datasets.
Subjects
Genomics/methods, Software, Nucleic Acid Databases, Human Genome, High-Throughput Nucleotide Sequencing, Humans, DNA Sequence Analysis/methods
ABSTRACT
BACKGROUND: DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. RESULTS: We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two-base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. CONCLUSION: The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, thereby facilitating genome re-sequencing efforts based on this form of sequence data.
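For orientation, the standard method that the abstract extends can be sketched as follows: local alignment by dynamic programming with an affine gap penalty (the Smith-Waterman recurrence with Gotoh's three-matrix formulation), applied here to plain base sequences. This is only the baseline and does not perform the simultaneous decoding of two-base data described above. The function name and the scoring values (match 2, mismatch -1, gap open -3, gap extend -1) are illustrative assumptions, not parameters from the paper.

```python
def local_align_affine(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score of a and b with affine gap penalties.

    A gap of length k scores gap_open + (k - 1) * gap_extend.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of a local alignment ending at (i, j)
    # E: best score ending at (i, j) with a gap in a (consumes b[j-1])
    # F: best score ending at (i, j) with a gap in b (consumes a[i-1])
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            score = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 option lets a local alignment restart anywhere.
            H[i][j] = max(0, H[i - 1][j - 1] + score, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


# Hypothetical toy inputs: the second sequence carries a 2-base insertion.
identical = local_align_affine("ACGT", "ACGT")
unrelated = local_align_affine("AAAA", "TTTT")
with_gap = local_align_affine("ACGTACGT", "ACGTGGACGT")
print(identical, unrelated, with_gap)
```

In the third call the eight matching bases are joined across a single gap of length two, so one gap-open and one gap-extend penalty are charged rather than two gap-opens, which is the point of the affine model.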
Subjects
Algorithms, DNA/chemistry, Sequence Alignment/methods, DNA Sequence Analysis/methods, Base Sequence
ABSTRACT
BACKGROUND: The emergence of next-generation sequencing technology presents tremendous opportunities to accelerate the discovery of rare variants or mutations that underlie human genetic disorders. Although the complete sequencing of the affected individuals' genomes would be the most powerful approach to finding such variants, the cost of such efforts makes it impractical for routine use in disease gene research. In cases where candidate genes or loci can be defined by linkage, association, or phenotypic studies, the practical sequencing target can be made much smaller than the whole genome, and it becomes critical to have capture methods that can be used to purify the desired portion of the genome for shotgun short-read sequencing without biasing allelic representation or coverage. One major approach is array-based capture, which relies on the ability to create a custom in-situ synthesized oligonucleotide microarray for use as a collection of hybridization capture probes. This approach is being used routinely by our group and others, and we are continuing to improve its performance. RESULTS: Here, we provide a complete protocol optimized for large aggregate sequence intervals and demonstrate its utility with the capture of all predicted amino acid coding sequence from 3,038 human genes using 241,700 60-mer oligonucleotides. Further, we demonstrate two techniques by which the efficiency of the capture can be increased: by introducing a step to block cross hybridization mediated by common adapter sequences used in sequencing library construction, and by repeating the hybridization capture step. These improvements can boost the targeting efficiency to the point where over 85% of the mapped sequence reads fall within 100 bases of the targeted regions. CONCLUSIONS: The complete protocol introduced in this paper enables researchers to perform practical capture experiments, and includes two novel methods for increasing the targeting efficiency.
Coupled with the new massively parallel sequencing technologies, this provides a powerful approach to identifying disease-causing genetic variants that can be localized within the genome by traditional methods.
Subjects
Genetic Loci , Human Genome , Oligonucleotide Array Sequence Analysis/methods , DNA Sequence Analysis/methods , Neoplasm DNA/genetics , Neoplasm Genes , Genomic Library , Humans , Sequence Alignment
ABSTRACT
BACKGROUND: Lower levels of quality asthma care among racially diverse populations might be due to inaccurate disease status assessments. The Asthma Control and Communication Instrument (ACCI) is a new tool that captures patient report of disease status during routine care. OBJECTIVE: We sought to test the ACCI's psychometric properties in a racially diverse population. METHODS: We performed a cross-sectional study. Subjects were recruited from specialist and generalist urban outpatient clinics. The ACCI and measures of asthma control, quality of life, lung function, and specialist rating of asthma status were collected. Four ACCI domains were separately validated: Acute Care, Bother, Control, and Direction. Principal component analysis; internal consistency; concurrent, discriminative, and known-groups validity; and accuracy were evaluated. RESULTS: Two hundred seventy asthmatic patients (77% female subjects, 55% black) participated. ACCI Control domain internal consistency was 0.80. ACCI Bother, Control, and Direction domains showed strong concurrent validity with asthma control and quality-of-life measures (all P < .001). ACCI Acute Care and Direction domains showed strong concurrent validity with individual validation items (all P < .001). The ACCI Control domain discriminated clinically important levels of disease status measured by asthma control, quality of life (both P < .001), and percent predicted peak expiratory flow rate (P = .005) and was associated with specialist rating of disease status (P < .001), confirming known-groups validity. The accuracy of the ACCI Control domain in classifying patients with uncontrolled asthma was very good (area under the curve, 0.851; 95% CI, 0.742-0.95870). Results were similar for both black and white subjects. CONCLUSION: The ACCI is a promising clinical tool that measures asthma disease status during routine health care and is valid for use in both black and white populations.
Subjects
Asthma/diagnosis , Asthma/therapy , Health Status Indicators , Healthcare Disparities , Quality of Health Care , Black People , Cross-Sectional Studies , Female , Humans , Male , Outpatients , Psychometrics , Quality of Life , Respiratory Function Tests , Urban Population , White People
ABSTRACT
RATIONALE AND OBJECTIVES: To assess the pretest practices of US clinicians who treat patients with acute pulmonary embolism (PE). MATERIALS AND METHODS: We surveyed 855 practicing physicians selected randomly from three professional organizations. We asked participants to estimate how often and by what method they determine the likelihood of PE before they request confirmatory studies. Participants reported their awareness of four published clinical practice guidelines dealing with acute PE and selected options for further diagnostic testing after reviewing clinical data from three hypothetical patients presenting with low, intermediate, and high probability of acute PE. RESULTS: We received completed surveys from 240 physicians practicing in 44 states. Although most (98.3%) report that they assess pretest probability of PE before testing, slightly more than half do so routinely. A total of 72.5% prefer an unstructured approach to pretest assessment, whereas 22.9% use published prediction rules. Most (93.0%) are aware of at least one published guideline for assessing acute PE, but only 44.2% report using one or more in daily practice. Respondents who use published prediction rules, estimate pretest probability routinely, or use at least one practice guideline were more likely to request additional testing when reviewing a low-probability clinical scenario. No differences in testing frequency or preferences were observed for intermediate- or high-probability clinical scenarios. CONCLUSIONS: The majority of clinicians we surveyed use an unstructured approach when estimating the pretest probability of acute PE. With the exception of the low-probability scenario, clinicians agreed on testing choices in suspected acute PE, regardless of the method or frequency of pretest assessment.