Results 1 - 20 of 27
1.
Nat Med ; 28(3): 513-516, 2022 03.
Article in English | MEDLINE | ID: mdl-35314819

ABSTRACT

Preimplantation genetic testing (PGT) of in-vitro-fertilized embryos has been proposed as a method to reduce transmission of common disease; however, more comprehensive embryo genetic assessment, combining the effects of common variants and rare variants, remains unavailable. Here, we used a combination of molecular and statistical techniques to reliably infer inherited genome sequence in 110 embryos and model susceptibility across 12 common conditions. We observed a genotype accuracy of 99.0-99.4% at sites relevant to polygenic risk scoring in cases from day-5 embryo biopsies and 97.2-99.1% in cases from day-3 embryo biopsies. Combining rare variants with polygenic risk score (PRS) magnifies predicted differences across sibling embryos. For example, in a couple with a pathogenic BRCA1 variant, we predicted a 15-fold difference in odds ratio (OR) across siblings when combining versus a 4.5-fold or 3-fold difference with BRCA1 or PRS alone. Our findings may inform the discussion of utility and implementation of genome-based PGT in clinical practice.


Subject(s)
Preimplantation Diagnosis; Blastocyst; Embryo, Mammalian; Female; Fertilization in Vitro; Genetic Testing/methods; Humans; Pregnancy; Preimplantation Diagnosis/methods
2.
Blood ; 135(26): 2337-2353, 2020 06 25.
Article in English | MEDLINE | ID: mdl-32157296

ABSTRACT

Targeted therapies against the BCR-ABL1 kinase have revolutionized treatment of chronic phase (CP) chronic myeloid leukemia (CML). In contrast, management of blast crisis (BC) CML remains challenging because BC cells acquire complex molecular alterations that confer stemness features to progenitor populations and resistance to BCR-ABL1 tyrosine kinase inhibitors. Comprehensive models of BC transformation have proved elusive because of the rarity and genetic heterogeneity of BC, but are important for developing biomarkers predicting BC progression and effective therapies. To better understand BC, we performed an integrated multiomics analysis of 74 CP and BC samples using whole-genome and exome sequencing, transcriptome and methylome profiling, and chromatin immunoprecipitation followed by high-throughput sequencing. Employing pathway-based analysis, we found the BC genome was significantly enriched for mutations affecting components of the polycomb repressive complex (PRC) pathway. Transcriptomically, BC progenitors were enriched for PRC1-related gene sets and depleted for PRC2-related gene sets. By integrating our data sets, we determined that BC progenitors undergo PRC-driven epigenetic reprogramming toward a convergent transcriptomic state. Specifically, PRC2 directs BC DNA hypermethylation, which in turn silences key genes involved in myeloid differentiation and tumor suppressor function via so-called epigenetic switching, whereas PRC1 represses an overlapping and distinct set of genes, including novel BC tumor suppressors. On the basis of these observations, we developed an integrated model of BC that facilitated the identification of combinatorial therapies capable of reversing BC reprogramming (decitabine+PRC1 inhibitors), novel PRC-silenced tumor suppressor genes (NR4A2), and gene expression signatures predictive of disease progression and drug resistance in CP.


Subject(s)
Blast Crisis/genetics; Gene Expression Regulation, Leukemic/genetics; Leukemia, Myelogenous, Chronic, BCR-ABL Positive/pathology; Polycomb Repressive Complex 1/physiology; Polycomb Repressive Complex 2/physiology; Cell Differentiation; Chromatin Immunoprecipitation; DNA Methylation; Datasets as Topic; Enhancer of Zeste Homolog 2 Protein/physiology; Gene Dosage; Gene Ontology; High-Throughput Nucleotide Sequencing; Humans; Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics; Mutation; Polycomb Repressive Complex 1/genetics; Polycomb Repressive Complex 2/genetics; Transcriptome; Exome Sequencing; Whole Genome Sequencing
3.
Genet Med ; 21(9): 2103-2115, 2019 09.
Article in English | MEDLINE | ID: mdl-30967659

ABSTRACT

PURPOSE: To identify the molecular cause in five unrelated families with a distinct autosomal dominant ocular systemic disorder we called ROSAH syndrome due to clinical features of retinal dystrophy, optic nerve edema, splenomegaly, anhidrosis, and migraine headache. METHODS: Independent discovery exome and genome sequencing in families 1, 2, and 3, and confirmation in families 4 and 5. Expression of wild-type messenger RNA and protein in human and mouse tissues and cell lines. Ciliary assays in fibroblasts from affected and unaffected family members. RESULTS: We found that the heterozygous missense variant in the α-kinase gene ALPK1 (c.710C>T, [p.Thr237Met]) segregated with disease in all five families. All patients shared the ROSAH phenotype with additional low-grade ocular inflammation, pancytopenia, recurrent infections, and mild renal impairment in some. ALPK1 was notably expressed in retina, retinal pigment epithelium, and optic nerve, with immunofluorescence indicating localization to the basal body of the connecting cilium of the photoreceptors, and presence in the sweat glands. Immunocytofluorescence revealed expression at the centrioles and spindle poles during metaphase, and at the base of the primary cilium. Affected family member fibroblasts demonstrated defective ciliogenesis. CONCLUSION: Heterozygosity for ALPK1 p.Thr237Met leads to ROSAH syndrome, an autosomal dominant ocular systemic disorder.


Subject(s)
Optic Nerve/pathology; Protein Kinases/genetics; Retina/metabolism; Retinal Dystrophies/genetics; Exome/genetics; Female; Heterozygote; Humans; Hypohidrosis/genetics; Hypohidrosis/pathology; Male; Migraine Disorders/genetics; Migraine Disorders/pathology; Mutation, Missense/genetics; Optic Nerve/metabolism; Pedigree; Phenotype; Retina/pathology; Retinal Dystrophies/pathology; Splenomegaly/genetics; Splenomegaly/pathology
4.
Bioinformatics ; 34(10): 1799-1800, 2018 05 15.
Article in English | MEDLINE | ID: mdl-29300845

ABSTRACT

Summary: ChronQC is a quality control (QC) tracking system for clinical implementation of next-generation sequencing (NGS). ChronQC generates time series plots for various QC metrics to allow comparison of current runs to historical runs. ChronQC has multiple features for tracking QC data including Westgard rules for clinical validity, laboratory-defined thresholds and historical observations within a specified time period. Users can record their notes and corrective actions directly onto the plots for long-term recordkeeping. ChronQC facilitates regular monitoring of clinical NGS to enable adherence to high quality clinical standards. Availability and implementation: ChronQC is freely available on GitHub (https://github.com/nilesh-tawari/ChronQC), Docker (https://hub.docker.com/r/nileshtawari/chronqc/) and the Python Package Index. ChronQC is implemented in Python and runs on all common operating systems (Windows, Linux and Mac OS X). Contact: tawari.nilesh@gmail.com or pauline.c.ng@gmail.com. Supplementary information: Supplementary data are available at Bioinformatics online.
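
The Westgard-style checks mentioned in this abstract can be illustrated with a minimal Python sketch. It is not ChronQC's code or API: the function name `westgard_flags`, the choice of the 1_3s and 2_2s rules, and the example values are all assumptions made for illustration.

```python
from statistics import mean, stdev

def westgard_flags(history, current_runs):
    """Flag new runs against two common Westgard rules (illustrative only).

    history: QC metric values from earlier, accepted runs; they set the
             baseline mean and standard deviation (SD).
    current_runs: new metric values in chronological order.
    Returns one list of violated rule names per new run.
    """
    mu = mean(history)
    sd = stdev(history)
    z = [(x - mu) / sd for x in current_runs]
    flags = []
    for i, zi in enumerate(z):
        rules = []
        # 1_3s: a single value more than 3 SD from the historical mean
        if abs(zi) > 3:
            rules.append("1_3s")
        # 2_2s: two consecutive values more than 2 SD away on the same side
        if i > 0 and ((zi > 2 and z[i - 1] > 2) or (zi < -2 and z[i - 1] < -2)):
            rules.append("2_2s")
        flags.append(rules)
    return flags

# Hypothetical per-run mean base quality scores
history = [30.0, 31.0, 29.0, 30.5, 29.5, 30.0, 31.0, 29.0]
print(westgard_flags(history, [30.2, 32.0, 32.1, 26.0]))
```

A laboratory-defined threshold, the other kind of check the abstract mentions, would simply be one more comparison per run alongside the two rules above.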


Subject(s)
High-Throughput Nucleotide Sequencing; Quality Control; Data Accuracy; High-Throughput Nucleotide Sequencing/methods; Software
5.
J Mol Diagn ; 18(3): 416-424, 2016 05.
Article in English | MEDLINE | ID: mdl-26970585

ABSTRACT

Targeted next-generation sequencing is becoming increasingly common as a clinical diagnostic and prognostic test for patient- and tumor-specific genetic profiles as well as to optimally select targeted therapies. Here, we describe a custom-developed, next-generation sequencing test for detecting single-nucleotide variants (SNVs) and short insertions and deletions (indels) in 93 genes related to gastrointestinal cancer from routine formalin-fixed, paraffin-embedded clinical specimens. We implemented a validation strategy, based on the College of American Pathologists requirements, using reference DNA mixtures from cell lines with known genetic variants, which model a broad range of allele frequencies. Test sensitivity achieved >99% for both SNVs and indels, with allele frequencies >10%, with high specificity (97.4% for SNVs and 93.6% for indels). We further confirmed test accuracies using primary formalin-fixed, paraffin-embedded colorectal cancer specimens characterized by alternative and conventional clinical diagnostic technologies. Robust performance was observed on the formalin-fixed, paraffin-embedded specimens: sensitivity was 97.2% and specificity was 99.2%. We also observed high intrarun and inter-run reproducibility, as well as a low cross-contamination rate. Overall assessment using cell line samples and formalin-fixed, paraffin-embedded samples showed that our custom next-generation sequencing assay has consistent detection sensitivity down to 10% variant frequency.
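
As a reminder of how the validation metrics reported above are defined, the following Python sketch computes sensitivity and specificity from confusion counts and applies a 10% variant-allele-frequency floor. The function names and all counts are hypothetical; they are not taken from the study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of known variants that the assay detected."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of known non-variant sites that the assay called negative."""
    return true_neg / (true_neg + false_pos)

def detectable(alt_reads, total_reads, min_vaf=0.10):
    """True when the variant allele frequency reaches the assumed 10% floor."""
    return total_reads > 0 and alt_reads / total_reads >= min_vaf

# Hypothetical counts, chosen only to show the arithmetic
print(round(sensitivity(35, 1), 3))   # 35 of 36 known variants detected
print(round(specificity(124, 1), 3))  # 124 of 125 negative sites called negative
print(detectable(12, 100), detectable(5, 100))
```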


Subject(s)
Biomarkers, Tumor; Gastrointestinal Neoplasms/diagnosis; Gastrointestinal Neoplasms/genetics; High-Throughput Nucleotide Sequencing; Mutation; DNA Mutational Analysis/methods; DNA Mutational Analysis/standards; Humans; INDEL Mutation; Polymorphism, Single Nucleotide; Reference Values; Reproducibility of Results; Sensitivity and Specificity
6.
Nat Protoc ; 11(1): 1-9, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26633127

ABSTRACT

The SIFT (sorting intolerant from tolerant) algorithm helps bridge the gap between mutations and phenotypic variations by predicting whether an amino acid substitution is deleterious. SIFT has been used in disease, mutation and genetic studies, and a protocol for its use has been previously published with Nature Protocols. This updated protocol describes SIFT 4G (SIFT for genomes), which is a faster version of SIFT that enables practical computations on reference genomes. Users can get predictions for single-nucleotide variants from their organism of interest using the SIFT 4G annotator with SIFT 4G's precomputed databases. The scope of genomic predictions is expanded, with predictions available for more than 200 organisms. Users can also run the SIFT 4G algorithm themselves. SIFT predictions can be retrieved for 6.7 million variants in 4 min once the database has been downloaded. If precomputed predictions are not available, the SIFT 4G algorithm can compute predictions at a rate of 2.6 s per protein sequence. SIFT 4G is available from http://sift-dna.org/sift4g.


Subject(s)
Algorithms; Genomics/methods; Mutation, Missense/genetics; Databases, Protein; Genomics/standards; Humans; Molecular Sequence Annotation; Phenotype; Reference Standards
7.
Per Med ; 13(4): 303-314, 2016 Jul.
Article in English | MEDLINE | ID: mdl-29749813

ABSTRACT

BACKGROUND: Procedural guidelines for disclosure of incidental genomic information are lacking. METHODS: We introduced a method and evaluated the impact of returning results to population biobank participants with 16p11.2 copy number variants, which are commonly associated with neurodevelopmental disorders and BMI imbalance. Of the 7877 participants, 11 carriers were detected. Eight participants were informed of their carrier status and surveyed 11-17 months later. RESULTS: All participants demonstrated preference for disclosure. Although two participants experienced worry, all five survey respondents rated receiving this information favorably. One participant reported modifications in treatment and three felt that their treatment/condition had since improved. CONCLUSION: This approach can be adapted and applied for the return of incidental findings to biobank participants.

8.
Int J Epidemiol ; 44(4): 1137-47, 2015 Aug.
Article in English | MEDLINE | ID: mdl-24518929

ABSTRACT

The Estonian Biobank cohort is a volunteer-based sample of the Estonian resident adult population (aged ≥18 years). The current number of participants, close to 52 000, represents a large proportion, 5%, of the Estonian adult population, making it ideally suited to population-based studies. General practitioners (GPs) and medical personnel in the special recruitment offices have recruited participants throughout the country. At baseline, the GPs performed a standardized health examination of the participants, who also donated blood samples for DNA, white blood cells and plasma tests and filled out a 16-module questionnaire on health-related topics such as lifestyle, diet and clinical diagnoses described in WHO ICD-10. A significant part of the cohort has whole genome sequencing (100), genome-wide single nucleotide polymorphism (SNP) array data (20 000) and/or NMR metabolome data (11 000) available (http://www.geenivaramu.ee/for-scientists/data-release/). The data are continuously updated through periodical linking to national electronic databases and registries. A part of the cohort has been re-contacted for follow-up purposes and resampling, and targeted invitations are possible for specific purposes, for example people with a specific diagnosis. The Estonian Genome Center of the University of Tartu is actively collaborating with many universities, research institutes and consortia and encourages fellow scientists worldwide to co-initiate new academic or industrial joint projects with us.


Subject(s)
Biological Specimen Banks/trends; Genome, Human/genetics; International Classification of Diseases/standards; Adolescent; Adult; Aged; Aged, 80 and over; Cohort Studies; Databases, Factual; Estonia; Female; Humans; Life Style; Male; Middle Aged; Public Health; Young Adult
9.
Cancer Res ; 74(21): 6071-81, 2014 Nov 01.
Article in English | MEDLINE | ID: mdl-25189529

ABSTRACT

Asian nonsmoking populations have a higher incidence of lung cancer compared with their European counterparts. There is a long-standing hypothesis that the increase of lung cancer in Asian never-smokers is due to environmental factors such as second-hand smoke. We analyzed whole-genome sequencing of 30 Asian lung cancers. Unsupervised clustering of mutational signatures separated the patients into two categories of either all the never-smokers or all the smokers or ex-smokers. In addition, nearly one third of the ex-smokers and smokers classified with the never-smoker-like cluster. The somatic variant profiles of Asian lung cancers were similar to those of European origin with G.C>T.A being predominant in smokers. We found EGFR and TP53 to be the most frequently mutated genes with mutations in 50% and 27% of individuals, respectively. Among the 16 never-smokers, 69% had an EGFR mutation compared with 29% of 14 smokers/ex-smokers. Asian never-smokers had lung cancer signatures distinct from the smoker signature and their mutation profiles were similar to European never-smokers. The profiles of Asian and European smokers are also similar. Taken together, these results suggested that the same mutational mechanisms underlie the etiology for both ethnic groups. Thus, the high incidence of lung cancer in Asian never-smokers seems unlikely to be due to second-hand smoke or other carcinogens that cause oxidative DNA damage, implying that routine EGFR testing is warranted in the Asian population regardless of smoking status.


Subject(s)
DNA Damage/genetics; Lung Neoplasms/epidemiology; Lung Neoplasms/genetics; Tobacco Smoke Pollution/adverse effects; Asian People/genetics; ErbB Receptors/genetics; Female; Genome, Human; High-Throughput Nucleotide Sequencing; Humans; Lung Neoplasms/pathology; Male; Middle Aged; Mutation; Risk Factors; Tumor Suppressor Protein p53/genetics
10.
Nat Methods ; 11(9): 935-7, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25086502

ABSTRACT

We introduce Phen-Gen, a method that combines patients' disease symptoms and sequencing data with prior domain knowledge to identify the causative genes for rare disorders. Simulations revealed that the causal variant was ranked first in 88% of cases when it was a coding variant (a 52% advantage over a genotype-only approach), and Phen-Gen outperformed other existing prediction methods by 13-58%. If disease etiology was unknown, the causal variant was assigned the top rank in 71% of simulations. Phen-Gen is available at http://phen-gen.org/.


Subject(s)
Chromosome Mapping/methods; Databases, Genetic; Genetic Predisposition to Disease/genetics; Genetic Testing/methods; Genome, Human/genetics; Rare Diseases/diagnosis; Rare Diseases/genetics; Data Mining/methods; Genotype; Humans; Phenotype; Polymorphism, Single Nucleotide/genetics; Software
11.
PLoS One ; 8(10): e77940, 2013.
Article in English | MEDLINE | ID: mdl-24194902

ABSTRACT

Indels in the coding regions of a gene can either cause frameshifts or amino acid insertions/deletions. Frameshifting indels are indels that have a length that is not divisible by 3 and subsequently cause frameshifts. Indels that have a length divisible by 3 cause amino acid insertions/deletions or block substitutions; we call these 3n indels. The new amino acid changes resulting from 3n indels could potentially affect protein function. Therefore, we construct a SIFT Indel prediction algorithm for 3n indels which achieves 82% accuracy, 81% sensitivity, 82% specificity, 82% precision, 0.63 MCC, and 0.87 AUC by 10-fold cross-validation. We have previously published a prediction algorithm for frameshifting indels. The rules for the prediction of 3n indels are different from the rules for the prediction of frameshifting indels and reflect the biological differences of these two different types of variations. SIFT Indel was applied to human 3n indels from the 1000 Genomes Project and the Exome Sequencing Project. We found that common variants are less likely to be deleterious than rare variants. The SIFT indel prediction algorithm for 3n indels is available at http://sift-dna.org/
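
The length rule that separates frameshifting indels from 3n indels, and the Matthews correlation coefficient (MCC) quoted above, can be sketched in Python as follows. The function names are illustrative, and the confusion counts are hypothetical values for a balanced set of 100 deleterious and 100 neutral indels; they are not the authors' data.

```python
from math import sqrt

def classify_coding_indel(ref, alt):
    """Classify a coding indel by its net length change in bases."""
    net = len(alt) - len(ref)
    if net == 0:
        raise ValueError("not an indel: ref and alt have the same length")
    # A net change divisible by 3 keeps the reading frame (a "3n indel")
    return "3n" if net % 3 == 0 else "frameshift"

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

print(classify_coding_indel("A", "ATTT"))  # 3-base insertion
print(classify_coding_indel("ATG", "A"))   # 2-base deletion

# Hypothetical counts: 81% sensitivity and 82% specificity on 100 + 100 indels
print(round(mcc(tp=81, tn=82, fp=18, fn=19), 2))
```

With these assumed counts the MCC comes out at about 0.63, which shows that the rounded sensitivity, specificity and MCC in the abstract are mutually consistent for a balanced test set.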


Subject(s)
Algorithms; Amino Acid Sequence/genetics; INDEL Mutation/genetics; Models, Genetic; Proteins/genetics; Area Under Curve; Humans; Sensitivity and Specificity
12.
J Psychopharmacol ; 27(10): 915-20, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23926243

ABSTRACT

Although antidepressants are widely used in the pharmacotherapy of major depressive disorder (MDD), their efficacy is still insufficient as approximately one-third of the patients do not fully recover even after several treatment trials. Inter-individual genetic differences are thought to contribute to the variability in antidepressant response; however, current findings from pharmacogenetic studies are uncertain or not clearly replicated. Here we report the first application of full exome sequencing for the analysis of pharmacogenomics on antidepressant treatment. After 12 weeks of treatment with the selective serotonin re-uptake inhibitor escitalopram, we selected five clear responders and five clear non-responders for exome sequencing. By comparing the allele counts of previously known single nucleotide polymorphisms and novel polymorphisms we selected 38 markers for further genotyping in two independent patient samples treated with escitalopram (n=116 and n=394). The A allele, carried by approximately 30% of the patients with MDD, of rs41271330 in the bone morphogenetic protein (BMP5) gene showed strong association with worse treatment response in both sample sets (p=0.001), indicating that this is a promising pharmacogenetic marker for prediction of antidepressant therapeutic outcome.


Subject(s)
Bone Morphogenetic Protein 5/genetics; Citalopram/therapeutic use; Depressive Disorder, Major/drug therapy; Depressive Disorder, Major/genetics; Exome/genetics; Adult; Alleles; Female; Genotype; Humans; Male; Polymorphism, Single Nucleotide/genetics; Sequence Analysis, DNA; Selective Serotonin Reuptake Inhibitors/therapeutic use; Treatment Outcome
13.
Genome Med ; 4(11): 88, 2012.
Article in English | MEDLINE | ID: mdl-23181697

ABSTRACT

Genomic variants with a key role in causing cancer or affecting the response to cancer therapeutics need to be identified so that they can be targeted for therapy. The transFIC tool aims to identify somatic point mutations that drive cancer in sequencing projects. This package is available as a web service, a stand-alone program and a website. It improves the functional prediction scores generated by popular established prediction tools and will be useful to cancer researchers. SEE RESEARCH ARTICLE: http://genomemedicine.com/content/4/11/89.

14.
Nucleic Acids Res ; 40(Web Server issue): W452-7, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22689647

ABSTRACT

The Sorting Intolerant from Tolerant (SIFT) algorithm predicts the effect of coding variants on protein function. It was first introduced in 2001, with a corresponding website that provides users with predictions on their variants. Since its release, SIFT has become one of the standard tools for characterizing missense variation. We have updated SIFT's genome-wide prediction tool since our last publication in 2009, and added new features to the insertion/deletion (indel) tool. We also show accuracy metrics on independent data sets. The original developers have hosted the SIFT web server at FHCRC and JCVI, and the web server is currently located at BII. The URL is http://sift-dna.org (24 May 2012, date last accessed).


Subject(s)
Amino Acid Substitution; Proteins/chemistry; Software; Algorithms; Genetic Variation; Humans; INDEL Mutation; Internet; Proteins/genetics; Proteins/metabolism; Sequence Analysis, Protein
15.
Genome Biol ; 13(2): R9, 2012 Feb 09.
Article in English | MEDLINE | ID: mdl-22322200

ABSTRACT

Each human has approximately 50 to 280 frameshifting indels, yet their implications are unknown. We created SIFT Indel, a prediction method for frameshifting indels that has 84% accuracy. The percentage of human frameshifting indels predicted to be gene-damaging is negatively correlated with allele frequency. We also show that although the first frameshifting indel in a gene causes loss of function, there is a tendency for the second frameshifting indel to compensate and restore protein function. SIFT Indel is available at http://sift-dna.org/www/SIFT_indels2.html.
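
The compensation effect described above rests on simple frame arithmetic: two frameshifting indels in the same gene put the downstream sequence back in frame when their net length changes sum to a multiple of 3. The Python sketch below shows only that arithmetic; the function name is hypothetical, and restoring the frame is assumed to be a necessary condition for rescue, not a guarantee that protein function is recovered.

```python
def restores_reading_frame(net_length_changes):
    """True when the combined indels leave the downstream reading frame intact.

    net_length_changes: net base-pair change of each indel in one gene,
    positive for insertions and negative for deletions.
    """
    return sum(net_length_changes) % 3 == 0

print(restores_reading_frame([+1]))      # single 1-bp insertion: frame lost
print(restores_reading_frame([+1, -1]))  # second indel compensates the first
print(restores_reading_frame([-2, -1]))  # two deletions totalling 3 bp
```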


Subject(s)
Computational Biology; Frameshift Mutation/genetics; INDEL Mutation/genetics; Proteins/genetics; Software; Algorithms; Animals; Asian People/genetics; Black People/genetics; Gene Frequency; Genome, Human; Humans; White People/genetics
16.
J Immunol ; 186(7): 4285-94, 2011 Apr 01.
Article in English | MEDLINE | ID: mdl-21383244

ABSTRACT

The human naive T cell repertoire is the repository of a vast array of TCRs. However, the factors that shape their hierarchical distribution and relationship with the memory repertoire remain poorly understood. In this study, we used polychromatic flow cytometry to isolate highly pure memory and naive CD8(+) T cells, stringently defined with multiple phenotypic markers, and used deep sequencing to characterize corresponding portions of their respective TCR repertoires from four individuals. The extent of interindividual TCR sharing and the overlap between the memory and naive compartments within individuals were determined by TCR clonotype frequencies, such that higher-frequency clonotypes were more commonly shared between compartments and individuals. TCR clonotype frequencies were, in turn, predicted by the efficiency of their production during V(D)J recombination. Thus, convergent recombination shapes the TCR repertoire of the memory and naive T cell pools, as well as their interrelationship within and between individuals.


Subject(s)
Gene Rearrangement, beta-Chain T-Cell Antigen Receptor/immunology; High-Throughput Nucleotide Sequencing/methods; Receptors, Antigen, T-Cell, alpha-beta/metabolism; T-Lymphocyte Subsets/immunology; T-Lymphocyte Subsets/metabolism; Adult; Clone Cells; Humans; Immunoglobulin Variable Region/biosynthesis; Immunoglobulin Variable Region/genetics; Immunoglobulin Variable Region/isolation & purification; Immunologic Memory/genetics; Male; Middle Aged; Receptors, Antigen, T-Cell, alpha-beta/genetics; Receptors, Antigen, T-Cell, alpha-beta/isolation & purification; Recombination, Genetic/immunology; Resting Phase, Cell Cycle/genetics; Resting Phase, Cell Cycle/immunology; T-Lymphocyte Subsets/cytology; Young Adult
17.
Methods Mol Biol ; 628: 215-26, 2010.
Article in English | MEDLINE | ID: mdl-20238084

ABSTRACT

Whole genome sequencing provides the most comprehensive collection of an individual's genetic variation. With the falling costs of sequencing technology, we envision a paradigm shift from microarray-based genotyping studies to whole genome sequencing. We review methodologies for whole genome sequencing. There are two approaches for assembling short shotgun sequence reads into longer contiguous genomic sequences. In the de novo assembly approach, sequence reads are compared to each other, and then overlapped to build longer contiguous sequences. The reference-based assembly approach involves mapping each read to a reference genome sequence. We discuss methods for identifying genetic variation (single nucleotide polymorphisms, small indels, and copy number variants) and building haplotypes from genome assemblies, and discuss potential pitfalls. We expect methodologies to evolve rapidly as sequencing technologies improve and more human genomes are sequenced.
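
The two assembly approaches contrasted in this abstract can be illustrated with a toy Python sketch. It assumes error-free reads and exact matching, which real assemblers and aligners do not; the function names, the `min_overlap` parameter and the example reads are all hypothetical.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0

def greedy_assemble(reads, min_overlap=3):
    """De novo sketch: repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(reads)
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap_length(a, b, min_overlap)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)

def map_read(reference, read):
    """Reference-based sketch: 0-based position of an exact match, or -1."""
    return reference.find(read)

reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))
print(map_read("ATGGCGTACCTGA", "GCGTACC"))
```

The greedy merge is quadratic in the number of reads per round, so it is only suitable for demonstrating the idea of overlapping reads into contigs.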


Subject(s)
Genome, Human; Sequence Analysis, DNA/methods; Genetic Variation; Genomics; Humans
19.
Nat Protoc ; 4(7): 1073-81, 2009.
Article in English | MEDLINE | ID: mdl-19561590

ABSTRACT

The effect of genetic mutation on phenotype is of significant interest in genetics. The type of genetic mutation that causes a single amino acid substitution (AAS) in a protein sequence is called a non-synonymous single nucleotide polymorphism (nsSNP). An nsSNP could potentially affect the function of the protein, subsequently altering the carrier's phenotype. This protocol describes the use of the 'Sorting Intolerant From Tolerant' (SIFT) algorithm in predicting whether an AAS affects protein function. To assess the effect of a substitution, SIFT assumes that important positions in a protein sequence have been conserved throughout evolution and therefore substitutions at these positions may affect protein function. Thus, by using sequence homology, SIFT predicts the effects of all possible substitutions at each position in the protein sequence. The protocol typically takes 5-20 min, depending on the input. SIFT is available as an online tool (http://sift.jcvi.org).
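
The conservation assumption described above can be illustrated with a toy Python sketch: a substitution is flagged when the new residue is rarely or never seen at that column of an alignment of homologous sequences. This is not the SIFT algorithm, which derives normalized probabilities from the alignment; the raw-frequency scoring, the function names, the 0.05 cutoff applied to raw frequencies, and the alignment itself are assumptions for illustration.

```python
from collections import Counter

def position_frequencies(aligned_seqs):
    """Per-column residue frequencies for equal-length aligned sequences ('-' = gap)."""
    freqs = []
    for col in range(len(aligned_seqs[0])):
        counts = Counter(s[col] for s in aligned_seqs if s[col] != "-")
        total = sum(counts.values())
        freqs.append({aa: n / total for aa, n in counts.items()} if total else {})
    return freqs

def predict_substitution(freqs, position, new_residue, cutoff=0.05):
    """Flag a substitution when the new residue is rare at this aligned position."""
    p = freqs[position].get(new_residue, 0.0)
    return "affects function" if p < cutoff else "tolerated"

# Hypothetical alignment of four homologous peptide fragments
alignment = ["MKV", "MRV", "MKV", "MKI"]
freqs = position_frequencies(alignment)
print(predict_substitution(freqs, 0, "L"))  # column 0 is fully conserved (M)
print(predict_substitution(freqs, 1, "R"))  # R is observed at column 1
```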


Subject(s)
Algorithms; Amino Acid Substitution; Proteins/genetics; Software; Amino Acid Sequence; Computer Simulation; Internet; Molecular Sequence Data; Phenotype; Proteins/physiology
20.
Genome Biol ; 10(3): R32, 2009.
Article in English | MEDLINE | ID: mdl-19327155

ABSTRACT

BACKGROUND: Next generation sequencing (NGS) platforms are currently being utilized for targeted sequencing of candidate genes or genomic intervals to perform sequence-based association studies. To evaluate these platforms for this application, we analyzed human sequence generated by the Roche 454, Illumina GA, and the ABI SOLiD technologies for the same 260 kb in four individuals. RESULTS: Local sequence characteristics contribute to systematic variability in sequence coverage (>100-fold difference in per-base coverage), resulting in patterns for each NGS technology that are highly correlated between samples. A comparison of the base calls to 88 kb of overlapping ABI 3730xL Sanger sequence generated for the same samples showed that the NGS platforms all have high sensitivity, identifying >95% of variant sites. At high coverage depth, base calling errors are systematic, resulting from local sequence contexts; as the coverage is lowered additional 'random sampling' errors in base calling occur. CONCLUSIONS: Our study provides important insights into systematic biases and data variability that need to be considered when utilizing NGS platforms for population targeted sequencing studies.
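
The two coverage statistics discussed above, the fold difference in per-base coverage and the correlation of coverage patterns between samples, can be computed as in the following Python sketch. The function names and depth values are hypothetical and only show the arithmetic; they do not reproduce the study's analysis.

```python
from math import sqrt
from statistics import mean

def coverage_fold_difference(depths):
    """Ratio of the highest to the lowest non-zero per-base coverage."""
    covered = [d for d in depths if d > 0]
    return max(covered) / min(covered)

def pearson(x, y):
    """Pearson correlation between two equal-length coverage profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-base depths for the same positions in two samples
sample_a = [10, 20, 400, 5]
sample_b = [20, 40, 800, 10]
print(coverage_fold_difference(sample_a))
print(round(pearson(sample_a, sample_b), 3))
```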


Subject(s)
Genetics, Population; Sequence Analysis, DNA/instrumentation; Base Sequence; Computer Simulation; False Positive Reactions; Genotype; Humans; Oligonucleotide Array Sequence Analysis; Polymorphism, Single Nucleotide/genetics; Sequence Alignment