Search | BVS Ecuador Search Portal

1.

Analysis of protein-coding genetic variation in 60,706 humans.

Lek, Monkol; Karczewski, Konrad J; Minikel, Eric V; Samocha, Kaitlin E; Banks, Eric; Fennell, Timothy; O'Donnell-Luria, Anne H; Ware, James S; Hill, Andrew J; Cummings, Beryl B; Tukiainen, Taru; Birnbaum, Daniel P; Kosmicki, Jack A; Duncan, Laramie E; Estrada, Karol; Zhao, Fengmei; Zou, James; Pierce-Hoffman, Emma; Berghout, Joanne; Cooper, David N; Deflaux, Nicole; DePristo, Mark; Do, Ron; Flannick, Jason; Fromer, Menachem; Gauthier, Laura; Goldstein, Jackie; Gupta, Namrata; Howrigan, Daniel; Kiezun, Adam; Kurki, Mitja I; Moonshine, Ami Levy; Natarajan, Pradeep; Orozco, Lorena; Peloso, Gina M; Poplin, Ryan; Rivas, Manuel A; Ruano-Rubio, Valentin; Rose, Samuel A; Ruderfer, Douglas M; Shakir, Khalid; Stenson, Peter D; Stevens, Christine; Thomas, Brett P; Tiao, Grace; Tusie-Luna, Maria T; Weisburd, Ben; Won, Hong-Hee; Yu, Dongmei; Altshuler, David M.

Nature ; 536(7616): 285-91, 2016 Aug 18.

Article in English | MEDLINE | ID: mdl-27535533

ABSTRACT

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

Subject(s)

Exome/genetics; Genetic Variation/genetics; DNA Mutational Analysis; Datasets as Topic; Humans; Phenotype; Proteome/genetics; Rare Diseases/genetics; Sample Size

2.

De novo mutations in schizophrenia implicate synaptic networks.

Fromer, Menachem; Pocklington, Andrew J; Kavanagh, David H; Williams, Hywel J; Dwyer, Sarah; Gormley, Padhraig; Georgieva, Lyudmila; Rees, Elliott; Palta, Priit; Ruderfer, Douglas M; Carrera, Noa; Humphreys, Isla; Johnson, Jessica S; Roussos, Panos; Barker, Douglas D; Banks, Eric; Milanova, Vihra; Grant, Seth G; Hannon, Eilis; Rose, Samuel A; Chambert, Kimberly; Mahajan, Milind; Scolnick, Edward M; Moran, Jennifer L; Kirov, George; Palotie, Aarno; McCarroll, Steven A; Holmans, Peter; Sklar, Pamela; Owen, Michael J; Purcell, Shaun M; O'Donovan, Michael C.

Nature ; 506(7487): 179-84, 2014 Feb 13.

Article in English | MEDLINE | ID: mdl-24463507

ABSTRACT

Inherited alleles account for most of the genetic risk for schizophrenia. However, new (de novo) mutations, in the form of large chromosomal copy number changes, occur in a small fraction of cases and disproportionally disrupt genes encoding postsynaptic proteins. Here we show that small de novo mutations, affecting one or a few nucleotides, are overrepresented among glutamatergic postsynaptic proteins comprising activity-regulated cytoskeleton-associated protein (ARC) and N-methyl-d-aspartate receptor (NMDAR) complexes. Mutations are additionally enriched in proteins that interact with these complexes to modulate synaptic strength, namely proteins regulating actin filament dynamics and those whose messenger RNAs are targets of fragile X mental retardation protein (FMRP). Genes affected by mutations in schizophrenia overlap those mutated in autism and intellectual disability, as do mutation-enriched synaptic pathways. Aligning our findings with a parallel case-control study, we demonstrate reproducible insights into aetiological mechanisms for schizophrenia and reveal pathophysiology shared with other neurodevelopmental disorders.

Subject(s)

Models, Neurological; Mutation/genetics; Nerve Net/metabolism; Neural Pathways/metabolism; Schizophrenia/genetics; Schizophrenia/physiopathology; Synapses/metabolism; Child Development Disorders, Pervasive/genetics; Cytoskeletal Proteins/metabolism; Exome/genetics; Fragile X Mental Retardation Protein/metabolism; Humans; Intellectual Disability/genetics; Mutation Rate; Nerve Net/physiopathology; Nerve Tissue Proteins/metabolism; Neural Pathways/physiopathology; Phenotype; RNA, Messenger/genetics; RNA, Messenger/metabolism; Receptors, N-Methyl-D-Aspartate/metabolism; Schizophrenia/metabolism; Substrate Specificity

3.

A polygenic burden of rare disruptive mutations in schizophrenia.

Purcell, Shaun M; Moran, Jennifer L; Fromer, Menachem; Ruderfer, Douglas; Solovieff, Nadia; Roussos, Panos; O'Dushlaine, Colm; Chambert, Kimberly; Bergen, Sarah E; Kähler, Anna; Duncan, Laramie; Stahl, Eli; Genovese, Giulio; Fernández, Esperanza; Collins, Mark O; Komiyama, Noboru H; Choudhary, Jyoti S; Magnusson, Patrik K E; Banks, Eric; Shakir, Khalid; Garimella, Kiran; Fennell, Tim; DePristo, Mark; Grant, Seth G N; Haggarty, Stephen J; Gabriel, Stacey; Scolnick, Edward M; Lander, Eric S; Hultman, Christina M; Sullivan, Patrick F; McCarroll, Steven A; Sklar, Pamela.

Nature ; 506(7487): 185-90, 2014 Feb 13.

Article in English | MEDLINE | ID: mdl-24463508

ABSTRACT

Schizophrenia is a common disease with a complex aetiology, probably involving multiple and heterogeneous genetic factors. Here, by analysing the exome sequences of 2,536 schizophrenia cases and 2,543 controls, we demonstrate a polygenic burden primarily arising from rare (less than 1 in 10,000), disruptive mutations distributed across many genes. Particularly enriched gene sets include the voltage-gated calcium ion channel and the signalling complex formed by the activity-regulated cytoskeleton-associated scaffold protein (ARC) of the postsynaptic density, sets previously implicated by genome-wide association and copy-number variation studies. Similar to reports in autism, targets of the fragile X mental retardation protein (FMRP, product of FMR1) are enriched for case mutations. No individual gene-based test achieves significance after correction for multiple testing and we do not detect any alleles of moderately low frequency (approximately 0.5 to 1 per cent) and moderately large effect. Taken together, these data suggest that population-based exome sequencing can discover risk alleles and complements established gene-mapping paradigms in neuropsychiatric disease.

Subject(s)

Multifactorial Inheritance/genetics; Mutation/genetics; Schizophrenia/genetics; Autistic Disorder/genetics; Calcium Channels/genetics; Cytoskeletal Proteins/genetics; DNA Copy Number Variations/genetics; Disks Large Homolog 4 Protein; Female; Fragile X Mental Retardation Protein/metabolism; Genome-Wide Association Study; Humans; Intellectual Disability/genetics; Intracellular Signaling Peptides and Proteins/genetics; Male; Membrane Proteins/genetics; Nerve Tissue Proteins/genetics; Receptors, N-Methyl-D-Aspartate/genetics

4.

Synaptic, transcriptional and chromatin genes disrupted in autism.

De Rubeis, Silvia; He, Xin; Goldberg, Arthur P; Poultney, Christopher S; Samocha, Kaitlin; Cicek, A Erucment; Kou, Yan; Liu, Li; Fromer, Menachem; Walker, Susan; Singh, Tarinder; Klei, Lambertus; Kosmicki, Jack; Shih-Chen, Fu; Aleksic, Branko; Biscaldi, Monica; Bolton, Patrick F; Brownfeld, Jessica M; Cai, Jinlu; Campbell, Nicholas G; Carracedo, Angel; Chahrour, Maria H; Chiocchetti, Andreas G; Coon, Hilary; Crawford, Emily L; Curran, Sarah R; Dawson, Geraldine; Duketis, Eftichia; Fernandez, Bridget A; Gallagher, Louise; Geller, Evan; Guter, Stephen J; Hill, R Sean; Ionita-Laza, Juliana; Jimenz Gonzalez, Patricia; Kilpinen, Helena; Klauck, Sabine M; Kolevzon, Alexander; Lee, Irene; Lei, Irene; Lei, Jing; Lehtimäki, Terho; Lin, Chiao-Feng; Ma'ayan, Avi; Marshall, Christian R; McInnes, Alison L; Neale, Benjamin; Owen, Michael J; Ozaki, Noriio; Parellada, Mara.

Nature ; 515(7526): 209-15, 2014 Nov 13.

Article in English | MEDLINE | ID: mdl-25363760

ABSTRACT

The genetic architecture of autism spectrum disorder involves the interplay of common and rare variants and their impact on hundreds of genes. Using exome sequencing, here we show that analysis of rare coding variation in 3,871 autism cases and 9,937 ancestry-matched or parental controls implicates 22 autosomal genes at a false discovery rate (FDR) < 0.05, plus a set of 107 autosomal genes strongly enriched for those likely to affect risk (FDR < 0.30). These 107 genes, which show unusual evolutionary constraint against mutations, incur de novo loss-of-function mutations in over 5% of autistic subjects. Many of the genes implicated encode proteins for synaptic formation, transcriptional regulation and chromatin-remodelling pathways. These include voltage-gated ion channels regulating the propagation of action potentials, pacemaking and excitability-transcription coupling, as well as histone-modifying enzymes and chromatin remodellers, most prominently those that mediate post-translational lysine methylation/demethylation modifications of histones.

Subject(s)

Child Development Disorders, Pervasive/genetics; Chromatin/genetics; Genetic Predisposition to Disease/genetics; Mutation/genetics; Synapses/metabolism; Transcription, Genetic/genetics; Amino Acid Sequence; Child Development Disorders, Pervasive/pathology; Chromatin/metabolism; Chromatin Assembly and Disassembly; Exome/genetics; Female; Germ-Line Mutation/genetics; Humans; Male; Molecular Sequence Data; Mutation, Missense/genetics; Nerve Net/metabolism; Odds Ratio

5.

Patterns and rates of exonic de novo mutations in autism spectrum disorders.

Neale, Benjamin M; Kou, Yan; Liu, Li; Ma'ayan, Avi; Samocha, Kaitlin E; Sabo, Aniko; Lin, Chiao-Feng; Stevens, Christine; Wang, Li-San; Makarov, Vladimir; Polak, Paz; Yoon, Seungtai; Maguire, Jared; Crawford, Emily L; Campbell, Nicholas G; Geller, Evan T; Valladares, Otto; Schafer, Chad; Liu, Han; Zhao, Tuo; Cai, Guiqing; Lihm, Jayon; Dannenfelser, Ruth; Jabado, Omar; Peralta, Zuleyma; Nagaswamy, Uma; Muzny, Donna; Reid, Jeffrey G; Newsham, Irene; Wu, Yuanqing; Lewis, Lora; Han, Yi; Voight, Benjamin F; Lim, Elaine; Rossin, Elizabeth; Kirby, Andrew; Flannick, Jason; Fromer, Menachem; Shakir, Khalid; Fennell, Tim; Garimella, Kiran; Banks, Eric; Poplin, Ryan; Gabriel, Stacey; DePristo, Mark; Wimbish, Jack R; Boone, Braden E; Levy, Shawn E; Betancur, Catalina; Sunyaev, Shamil.

Nature ; 485(7397): 242-5, 2012 Apr 04.

Article in English | MEDLINE | ID: mdl-22495311

ABSTRACT

Autism spectrum disorders (ASD) are believed to have genetic and environmental origins, yet in only a modest fraction of individuals can specific causes be identified. To identify further genetic risk factors, here we assess the role of de novo mutations in ASD by sequencing the exomes of ASD cases and their parents (n = 175 trios). Fewer than half of the cases (46.3%) carry a missense or nonsense de novo variant, and the overall rate of mutation is only modestly higher than the expected rate. In contrast, the proteins encoded by genes that harboured de novo missense or nonsense mutations showed a higher degree of connectivity among themselves and to previous ASD genes as indexed by protein-protein interaction screens. The small increase in the rate of de novo events, when taken together with the protein interaction results, is consistent with an important but limited role for de novo point mutations in ASD, similar to that documented for de novo copy number variants. Genetic models incorporating these data indicate that most of the observed de novo events are unconnected to ASD; those that do confer risk are distributed across many genes and are incompletely penetrant (that is, not necessarily sufficient for disease). Our results support polygenic models in which spontaneous coding mutations in any of a large number of genes increase risk by 5- to 20-fold. Despite the challenge posed by such models, results from de novo events and a large parallel case-control study provide strong evidence in favour of CHD8 and KATNAL2 as genuine autism risk factors.

Subject(s)

Autistic Disorder/genetics; DNA-Binding Proteins/genetics; Exons/genetics; Genetic Predisposition to Disease/genetics; Mutation/genetics; Transcription Factors/genetics; Case-Control Studies; Exome/genetics; Family Health; Humans; Models, Genetic; Multifactorial Inheritance/genetics; Phenotype; Poisson Distribution; Protein Interaction Maps

6.

Clonal hematopoiesis and blood-cancer risk inferred from blood DNA sequence.

Genovese, Giulio; Kähler, Anna K; Handsaker, Robert E; Lindberg, Johan; Rose, Samuel A; Bakhoum, Samuel F; Chambert, Kimberly; Mick, Eran; Neale, Benjamin M; Fromer, Menachem; Purcell, Shaun M; Svantesson, Oscar; Landén, Mikael; Höglund, Martin; Lehmann, Sören; Gabriel, Stacey B; Moran, Jennifer L; Lander, Eric S; Sullivan, Patrick F; Sklar, Pamela; Grönberg, Henrik; Hultman, Christina M; McCarroll, Steven A.

N Engl J Med ; 371(26): 2477-87, 2014 Dec 25.

Article in English | MEDLINE | ID: mdl-25426838

ABSTRACT

BACKGROUND: Cancers arise from multiple acquired mutations, which presumably occur over many years. Early stages in cancer development might be present years before cancers become clinically apparent. METHODS: We analyzed data from whole-exome sequencing of DNA in peripheral-blood cells from 12,380 persons, unselected for cancer or hematologic phenotypes. We identified somatic mutations on the basis of unusual allelic fractions. We used data from Swedish national patient registers to follow health outcomes for 2 to 7 years after DNA sampling. RESULTS: Clonal hematopoiesis with somatic mutations was observed in 10% of persons older than 65 years of age but in only 1% of those younger than 50 years of age. Detectable clonal expansions most frequently involved somatic mutations in three genes (DNMT3A, ASXL1, and TET2) that have previously been implicated in hematologic cancers. Clonal hematopoiesis was a strong risk factor for subsequent hematologic cancer (hazard ratio, 12.9; 95% confidence interval, 5.8 to 28.7). Approximately 42% of hematologic cancers in this cohort arose in persons who had clonality at the time of DNA sampling, more than 6 months before a first diagnosis of cancer. Analysis of bone marrow-biopsy specimens obtained from two patients at the time of diagnosis of acute myeloid leukemia revealed that their cancers arose from the earlier clones. CONCLUSIONS: Clonal hematopoiesis with somatic mutations is readily detected by means of DNA sequencing, is increasingly common as people age, and is associated with increased risks of hematologic cancer and death. A subset of the genes that are mutated in patients with myeloid cancers is frequently mutated in apparently healthy persons; these mutations may represent characteristic early events in the development of hematologic cancers. (Funded by the National Human Genome Research Institute and others.).
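
The abstract above says that somatic mutations were identified "on the basis of unusual allelic fractions" but does not name the statistical test. As a rough illustration only, the sketch below assumes an exact two-sided binomial test against the 0.5 fraction expected for a germline heterozygous variant. The function name, the read counts, and the choice of test are all assumptions made for this example and are not taken from the paper.

```python
from math import comb


def binomial_two_sided_p(alt_reads: int, total_reads: int, expected: float = 0.5) -> float:
    """Exact two-sided binomial p-value for an observed allelic fraction.

    Hypothetical helper: tests whether alt_reads out of total_reads deviates
    from the fraction expected for a germline heterozygous site (0.5).
    """
    # Probability of every possible alternate-read count under the null.
    pmf = [
        comb(total_reads, i) * expected**i * (1 - expected) ** (total_reads - i)
        for i in range(total_reads + 1)
    ]
    observed = pmf[alt_reads]
    # Sum the outcomes that are no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Illustrative read counts, not data from the study.
    print(binomial_two_sided_p(12, 100))  # low allelic fraction
    print(binomial_two_sided_p(48, 100))  # close to the germline expectation
```

A small p-value here only flags a site whose allelic fraction is hard to explain as a germline heterozygote; a real pipeline would also need to handle sequencing error, mapping bias, and multiple testing.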

Subject(s)

Blood; Cell Transformation, Neoplastic/genetics; Hematologic Neoplasms/genetics; Hematopoiesis/physiology; Hematopoietic Stem Cells/physiology; Mutation; Adult; Age Factors; Aged; Aged, 80 and over; Clone Cells; DNA Mutational Analysis; Exome; Hematologic Neoplasms/physiopathology; Humans; Middle Aged; Risk Factors; Young Adult

7.

Identification of small exonic CNV from whole-exome sequence data and application to autism spectrum disorder.

Poultney, Christopher S; Goldberg, Arthur P; Drapeau, Elodie; Kou, Yan; Harony-Nicolas, Hala; Kajiwara, Yuji; De Rubeis, Silvia; Durand, Simon; Stevens, Christine; Rehnström, Karola; Palotie, Aarno; Daly, Mark J; Ma'ayan, Avi; Fromer, Menachem; Buxbaum, Joseph D.

Am J Hum Genet ; 93(4): 607-19, 2013 Oct 03.

Article in English | MEDLINE | ID: mdl-24094742

ABSTRACT

Copy number variation (CNV) is an important determinant of human diversity and plays important roles in susceptibility to disease. Most studies of CNV carried out to date have made use of chromosome microarray and have had a lower size limit for detection of about 30 kilobases (kb). With the emergence of whole-exome sequencing studies, we asked whether such data could be used to reliably call rare exonic CNV in the size range of 1-30 kilobases (kb), making use of the eXome Hidden Markov Model (XHMM) program. By using both transmission information and validation by molecular methods, we confirmed that small CNV encompassing as few as three exons can be reliably called from whole-exome data. We applied this approach to an autism case-control sample (n = 811, mean per-target read depth = 161) and observed a significant increase in the burden of rare (MAF ≤1%) 1-30 kb CNV, 1-30 kb deletions, and 1-10 kb deletions in ASD. CNV in the 1-30 kb range frequently hit just a single gene, and we were therefore able to carry out enrichment and pathway analyses, where we observed enrichment for disruption of genes in cytoskeletal and autophagy pathways in ASD. In summary, our results showed that XHMM provided an effective means to assess small exonic CNV from whole-exome data, indicated that rare 1-30 kb exonic deletions could contribute to risk in up to 7% of individuals with ASD, and implicated a candidate pathway in developmental delay syndromes.

Subject(s)

Child Development Disorders, Pervasive/genetics; DNA Copy Number Variations; Exome; Autophagy/genetics; Base Sequence; Case-Control Studies; Child; Exons; Gene Deletion; Genetic Predisposition to Disease; Humans; Molecular Sequence Data; Sequence Analysis, DNA/methods

8.

Increased frequency of de novo copy number variants in congenital heart disease by integrative analysis of single nucleotide polymorphism array and exome sequence data.

Glessner, Joseph T; Bick, Alexander G; Ito, Kaoru; Homsy, Jason; Rodriguez-Murillo, Laura; Fromer, Menachem; Mazaika, Erica; Vardarajan, Badri; Italia, Michael; Leipzig, Jeremy; DePalma, Steven R; Golhar, Ryan; Sanders, Stephan J; Yamrom, Boris; Ronemus, Michael; Iossifov, Ivan; Willsey, A Jeremy; State, Matthew W; Kaltman, Jonathan R; White, Peter S; Shen, Yufeng; Warburton, Dorothy; Brueckner, Martina; Seidman, Christine; Goldmuntz, Elizabeth; Gelb, Bruce D; Lifton, Richard; Seidman, Jonathan; Hakonarson, Hakon; Chung, Wendy K.

Circ Res ; 115(10): 884-896, 2014 Oct 24.

Article in English | MEDLINE | ID: mdl-25205790

ABSTRACT

RATIONALE: Congenital heart disease (CHD) is among the most common birth defects. Most cases are of unknown pathogenesis. OBJECTIVE: To determine the contribution of de novo copy number variants (CNVs) in the pathogenesis of sporadic CHD. METHODS AND RESULTS: We studied 538 CHD trios using genome-wide dense single nucleotide polymorphism arrays and whole exome sequencing. Results were experimentally validated using digital droplet polymerase chain reaction. We compared validated CNVs in CHD cases with CNVs in 1301 healthy control trios. The 2 complementary high-resolution technologies identified 63 validated de novo CNVs in 51 CHD cases. A significant increase in CNV burden was observed when comparing CHD trios with healthy trios, using either single nucleotide polymorphism array (P=7×10⁻⁵; odds ratio, 4.6) or whole exome sequencing data (P=6×10⁻⁴; odds ratio, 3.5) and remained after removing 16% of de novo CNV loci previously reported as pathogenic (P=0.02; odds ratio, 2.7). We observed recurrent de novo CNVs on 15q11.2 encompassing CYFIP1, NIPA1, and NIPA2 and single de novo CNVs encompassing DUSP1, JUN, JUP, MED15, MED9, PTPRE, SREBF1, TOP2A, and ZEB2, genes that interact with established CHD proteins NKX2-5 and GATA4. Integrating de novo variants in whole exome sequencing and CNV data suggests that ETS1 is the pathogenic gene altered by 11q24.2-q25 deletions in Jacobsen syndrome and that CTBP2 is the pathogenic gene in 10q subtelomeric deletions. CONCLUSIONS: We demonstrate a significantly increased frequency of rare de novo CNVs in CHD patients compared with healthy controls and suggest several novel genetic loci for CHD.

Subject(s)

DNA Copy Number Variations/genetics; Exome/genetics; Gene Frequency/genetics; Heart Defects, Congenital/genetics; Polymorphism, Single Nucleotide/genetics; Case-Control Studies; Cohort Studies; Gene Regulatory Networks/genetics; Heart Defects, Congenital/diagnosis; Humans; Molecular Sequence Data

9.

Genomic aberrations in cervical adenocarcinomas in Hong Kong Chinese women.

Chung, Tony K H; Van Hummelen, Paul; Chan, Paul K S; Cheung, Tak Hong; Yim, So Fan; Yu, Mei Y; Ducar, Matthew D; Thorner, Aaron R; MacConaill, Laura E; Doran, Graeme; Pedamallu, Chandra Sekhar; Ojesina, Akinyemi I; Wong, Raymond R Y; Wang, Vivian W; Freeman, Samuel S; Lau, Tat San; Kwong, Joseph; Chan, Loucia K Y; Fromer, Menachem; May, Taymaa; Worley, Michael J; Esselen, Katharine M; Elias, Kevin M; Lawrence, Michael; Getz, Gad; Smith, David I; Crum, Christopher P; Meyerson, Matthew; Berkowitz, Ross S; Wong, Yick Fu.

Int J Cancer ; 137(4): 776-83, 2015 Aug 15.

Article in English | MEDLINE | ID: mdl-25626421

ABSTRACT

Although the rates of cervical squamous cell carcinoma have been declining, the rates of cervical adenocarcinoma are increasing in some countries. Outcomes for advanced cervical adenocarcinoma remain poor. Precision mapping of genetic alterations in cervical adenocarcinoma may enable better selection of therapies and deliver improved outcomes when combined with new sequencing diagnostics. We present whole-exome sequencing results from 15 cervical adenocarcinomas and paired normal samples from Hong Kong Chinese women. These data revealed a heterogeneous mutation spectrum and identified several frequently altered genes including FAT1, ARID1A, ERBB2 and PIK3CA. Exome sequencing identified human papillomavirus (HPV) sequences in 13 tumors in which the HPV genome might have integrated into and hence disrupted the functions of certain exons, raising the possibility that HPV integration can alter pathways other than p53 and pRb. Together, these provisionary data suggest the potential for individualized therapies for cervical adenocarcinoma based on genomic information.

Subject(s)

Adenocarcinoma/genetics; High-Throughput Nucleotide Sequencing; Uterine Cervical Neoplasms/genetics; Adenocarcinoma/pathology; Adenocarcinoma/virology; Adult; Aged; Exome; Female; Hong Kong; Humans; Middle Aged; Mutation; Neoplasm Staging; Papillomaviridae/genetics; Papillomaviridae/pathogenicity; Uterine Cervical Neoplasms/pathology; Uterine Cervical Neoplasms/virology

10.

Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth.

Fromer, Menachem; Moran, Jennifer L; Chambert, Kimberly; Banks, Eric; Bergen, Sarah E; Ruderfer, Douglas M; Handsaker, Robert E; McCarroll, Steven A; O'Donovan, Michael C; Owen, Michael J; Kirov, George; Sullivan, Patrick F; Hultman, Christina M; Sklar, Pamela; Purcell, Shaun M.

Am J Hum Genet ; 91(4): 597-607, 2012 Oct 05.

Article in English | MEDLINE | ID: mdl-23040492

ABSTRACT

Sequencing of gene-coding regions (the exome) is increasingly used for studying human disease, for which copy-number variants (CNVs) are a critical genetic component. However, detecting copy number from exome sequencing is challenging because of the noncontiguous nature of the captured exons. This is compounded by the complex relationship between read depth and copy number; this results from biases in targeted genomic hybridization, sequence factors such as GC content, and batching of samples during collection and sequencing. We present a statistical tool (exome hidden Markov model [XHMM]) that uses principal-component analysis (PCA) to normalize exome read depth and a hidden Markov model (HMM) to discover exon-resolution CNV and genotype variation across samples. We evaluate performance on 90 schizophrenia trios and 1,017 case-control samples. XHMM detects a median of two rare (<1%) CNVs per individual (one deletion and one duplication) and has 79% sensitivity to similarly rare CNVs overlapping three or more exons discovered with microarrays. With sensitivity similar to state-of-the-art methods, XHMM achieves higher specificity by assigning quality metrics to the CNV calls to filter out bad ones, as well as to statistically genotype the discovered CNV in all individuals, yielding a trio call set with Mendelian-inheritance properties highly consistent with expectation. We also show that XHMM breakpoint quality scores enable researchers to explicitly search for novel classes of structural variation. For example, we apply XHMM to extract those CNVs that are highly likely to disrupt (delete or duplicate) only a portion of a gene.
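
The PCA normalization step mentioned in the abstract above can be pictured with a small NumPy sketch. This is not the XHMM implementation: the function name, the matrix layout (samples by exome targets), and the rule of dropping a fixed number of leading components are assumptions made here for illustration. The idea shown is that the strongest principal components of a read-depth matrix often capture batch and capture effects rather than true copy number, so removing them leaves a cleaner signal for a downstream HMM.

```python
import numpy as np


def pca_normalize(depth: np.ndarray, n_remove: int) -> np.ndarray:
    """Remove the n_remove strongest principal components from a depth matrix.

    depth: samples x targets matrix of mean read depth (hypothetical layout).
    Returns the centered matrix with the leading components projected out.
    """
    # Center each target (column) so PCA describes variation across samples.
    centered = depth - depth.mean(axis=0, keepdims=True)
    # SVD of the centered matrix gives the principal components.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Zero the largest singular values, assumed here to reflect systematic noise.
    s_kept = s.copy()
    s_kept[:n_remove] = 0.0
    # Rebuild the matrix from the remaining components.
    return (u * s_kept) @ vt


if __name__ == "__main__":
    # Synthetic example: one shared batch effect plus a little noise.
    rng = np.random.default_rng(0)
    batch = rng.normal(size=20)            # per-sample batch strength
    loading = rng.normal(size=200)         # per-target sensitivity to the batch
    depth = 100 + np.outer(batch, loading) + 0.01 * rng.normal(size=(20, 200))
    cleaned = pca_normalize(depth, n_remove=1)
    print(depth.std(), cleaned.std())
```

How many components to remove is a tuning decision; the fixed count used here is only a placeholder for whatever criterion a real tool applies.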

Subject(s)

DNA Copy Number Variations; Exome; Exons; Genome-Wide Association Study/methods; High-Throughput Nucleotide Sequencing/methods; Case-Control Studies; Genotype; Genotyping Techniques/methods; Humans; Models, Genetic; Nucleic Acid Hybridization/methods; Oligonucleotide Array Sequence Analysis/methods

11.

zCall: a rare variant caller for array-based genotyping: genetics and population analysis.

Goldstein, Jacqueline I; Crenshaw, Andrew; Carey, Jason; Grant, George B; Maguire, Jared; Fromer, Menachem; O'Dushlaine, Colm; Moran, Jennifer L; Chambert, Kimberly; Stevens, Christine; Sklar, Pamela; Hultman, Christina M; Purcell, Shaun; McCarroll, Steven A; Sullivan, Patrick F; Daly, Mark J; Neale, Benjamin M.

Bioinformatics ; 28(19): 2543-5, 2012 Oct 01.

Article in English | MEDLINE | ID: mdl-22843986

ABSTRACT

SUMMARY: zCall is a variant caller specifically designed for calling rare single-nucleotide polymorphisms from array-based technology. This caller is implemented as a post-processing step after a default calling algorithm has been applied. The algorithm uses the intensity profile of the common allele homozygote cluster to define the location of the other two genotype clusters. We demonstrate improved detection of rare alleles when applying zCall to samples that have both Illumina Infinium HumanExome BeadChip and exome sequencing data available. AVAILABILITY: http://atguweb.mgh.harvard.edu/apps/zcall. CONTACT: bneale@broadinstitute.org SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Algorithms; Genotyping Techniques; Polymorphism, Single Nucleotide; Software; Alleles; Cluster Analysis; Exome; Homozygote; Humans

12.

Recovering key biological constituents through sparse representation of gene expression.

Prat, Yosef; Fromer, Menachem; Linial, Nathan; Linial, Michal.

Bioinformatics ; 27(5): 655-61, 2011 Mar 01.

Article in English | MEDLINE | ID: mdl-21258061

ABSTRACT

MOTIVATION: Large-scale RNA expression measurements are generating enormous quantities of data. During the last two decades, many methods were developed for extracting insights regarding the interrelationships between genes from such data. The mathematical and computational perspectives that underlie these methods are usually algebraic or probabilistic. RESULTS: Here, we introduce an unexplored geometric view point where expression levels of genes in multiple experiments are interpreted as vectors in a high-dimensional space. Specifically, we find, for the expression profile of each particular gene, its approximation as a linear combination of profiles of a few other genes. This method is inspired by recent developments in the realm of compressed sensing in the machine learning domain. To demonstrate the power of our approach in extracting valuable information from the expression data, we independently applied it to large-scale experiments carried out on the yeast and malaria parasite whole transcriptomes. The parameters extracted from the sparse reconstruction of the expression profiles, when fed to a supervised learning platform, were used to successfully predict the relationships between genes throughout the Gene Ontology hierarchy and protein-protein interaction map. Extensive assessment of the biological results shows high accuracy in both recovering known predictions and in yielding accurate predictions missing from the current databases. We suggest that the geometrical approach presented here is suitable for a broad range of high-dimensional experimental data.
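
The central idea in the abstract above, approximating one gene's expression profile as a linear combination of the profiles of a few other genes, can be sketched in a few lines. The abstract does not say which sparse solver the authors used, so this example substitutes greedy orthogonal matching pursuit, a standard technique from the compressed-sensing literature. The function name and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def sparse_approximation(target: np.ndarray, others: np.ndarray, n_terms: int):
    """Greedy orthogonal matching pursuit (a stand-in for the paper's solver).

    target: expression profile of one gene across experiments (length m).
    others: m x g matrix whose columns are profiles of the other genes;
            columns are assumed to be non-zero.
    Returns the chosen column indices and their least-squares coefficients.
    """
    norms = np.linalg.norm(others, axis=0)
    residual = target.astype(float).copy()
    support: list[int] = []
    coef = np.zeros(0)
    for _ in range(n_terms):
        # Score each gene by its normalized correlation with the residual.
        scores = np.abs(others.T @ residual) / norms
        if support:
            scores[support] = -np.inf  # never pick the same gene twice
        support.append(int(np.argmax(scores)))
        # Refit all chosen genes jointly, then update the residual.
        coef, *_ = np.linalg.lstsq(others[:, support], target, rcond=None)
        residual = target - others[:, support] @ coef
    return support, coef


if __name__ == "__main__":
    # Synthetic profiles: 50 experiments, 10 candidate genes.
    rng = np.random.default_rng(0)
    others = rng.normal(size=(50, 10))
    target = 2.0 * others[:, 1] - 3.0 * others[:, 4]
    support, coef = sparse_approximation(target, others, n_terms=2)
    print(sorted(support), coef)
```

The selected indices and coefficients are the kind of per-gene parameters that, according to the abstract, were then passed to a supervised learner; that downstream step is not reproduced here.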

Subject(s)

Computational Biology/methods; Gene Expression Profiling/methods; Artificial Intelligence; Plasmodium falciparum/genetics; RNA, Fungal/genetics; RNA, Protozoan/genetics; Saccharomyces cerevisiae/genetics

13.

PANDORA: analysis of protein and peptide sets through the hierarchical integration of annotations.

Rappoport, Nadav; Fromer, Menachem; Schweiger, Regev; Linial, Michal.

Nucleic Acids Res ; 38(Web Server issue): W84-9, 2010 Jul.

Article in English | MEDLINE | ID: mdl-20444873

ABSTRACT

Derivation of biological meaning from large sets of proteins or genes is a frequent task in genomic and proteomic studies. Such sets often arise from experimental methods including large-scale gene expression experiments and mass spectrometry (MS) proteomics. Large sets of genes or proteins are also the outcome of computational methods such as BLAST search and homology-based classifications. We have developed the PANDORA web server, which functions as a platform for the advanced biological analysis of sets of genes, proteins, or proteolytic peptides. First, the input set is mapped to a set of corresponding proteins. Then, an analysis of the protein set produces a graph-based hierarchy which highlights intrinsic relations amongst biological subsets, in light of their different annotations from multiple annotation resources. PANDORA integrates a large collection of annotation sources (GO, UniProt Keywords, InterPro, Enzyme, SCOP, CATH, Gene-3D, NCBI taxonomy and more) that comprise approximately 200,000 different annotation terms associated with approximately 3.2 million sequences from UniProtKB. Statistical enrichment based on a binomial approximation of the hypergeometric distribution and corrected for multiple hypothesis tests is calculated using several background sets, including major gene-expression DNA-chip platforms. Users can also visualize either standard or user-defined binary and quantitative properties alongside the proteins. PANDORA 4.2 is available at http://www.pandora.cs.huji.ac.il.
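
The enrichment statistic described in the abstract above, a binomial approximation of the hypergeometric distribution with a correction for multiple tests, can be illustrated with the standard library alone. This is a sketch of the general technique, not PANDORA's code. The function names are invented for the example, and the Bonferroni correction is an assumption, since the abstract does not say which correction is applied.

```python
from math import comb


def hypergeom_tail(k: int, population: int, annotated: int, drawn: int) -> float:
    """Exact P(X >= k): chance that at least k of `drawn` proteins carry an
    annotation when `annotated` of the `population` background proteins do."""
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, drawn)


def binomial_tail(k: int, drawn: int, p: float) -> float:
    """Binomial approximation of the same tail, with p = annotated / population."""
    return sum(
        comb(drawn, i) * p**i * (1 - p) ** (drawn - i) for i in range(k, drawn + 1)
    )


def bonferroni(p_values: list[float]) -> list[float]:
    """Bonferroni correction (assumed here; the abstract names no method)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Illustrative numbers only: 40 of 20,000 background proteins carry a term,
    # and 5 of the 100 proteins in the input set carry it.
    exact = hypergeom_tail(5, 20_000, 40, 100)
    approx = binomial_tail(5, 100, 40 / 20_000)
    print(exact, approx, bonferroni([exact, 0.2, 0.04]))
```

The binomial form treats the draws as independent, which is a close approximation when the input set is small relative to the background and is cheaper to evaluate across many annotation terms.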

Subject(s)

Peptides/chemistry; Peptides/metabolism; Proteins/chemistry; Proteins/metabolism; Software; Animals; Data Interpretation, Statistical; Databases, Protein; Humans; Internet; Mass Spectrometry; Mice; Peptides/physiology; Proteins/physiology; Proteomics; Rats; Systems Integration; User-Computer Interface

14.

Exposing the co-adaptive potential of protein-protein interfaces through computational sequence design.

Fromer, Menachem; Linial, Michal.

Bioinformatics ; 26(18): 2266-72, 2010 Sep 15.

Article in English | MEDLINE | ID: mdl-20679332

ABSTRACT

MOTIVATION: In nature, protein-protein interactions are constantly evolving under various selective pressures. Nonetheless, it is expected that crucial interactions are maintained through compensatory mutations between interacting proteins. Thus, many studies have used evolutionary sequence data to extract such occurrences of correlated mutation. However, this research is confounded by other evolutionary pressures that contribute to sequence covariance, such as common ancestry. RESULTS: Here, we focus exclusively on the compensatory mutations deriving from physical protein interactions, by performing large-scale computational mutagenesis experiments for >260 protein-protein interfaces. We investigate the potential for co-adaptability present in protein pairs that are always found together in nature (obligate) and those that are occasionally in complex (transient). By modeling each complex both in bound and unbound forms, we find that naturally transient complexes possess greater relative capacity for correlated mutation than obligate complexes, even when differences in interface size are taken into account.

Subject(s)

Mutation , Proteins/chemistry , Biological Adaptation , Base Sequence , Computational Biology , Molecular Evolution , Protein Binding/genetics , Proteins/genetics

15.

SPRINT: side-chain prediction inference toolbox for multistate protein design.

Fromer, Menachem; Yanover, Chen; Harel, Amir; Shachar, Ori; Weiss, Yair; Linial, Michal.

Bioinformatics ; 26(19): 2466-7, 2010 Oct 01.

Article in English | MEDLINE | ID: mdl-20685957

ABSTRACT

UNLABELLED: SPRINT is a software package that performs computational multistate protein design using state-of-the-art inference on probabilistic graphical models. The input to SPRINT is a list of protein structures, the rotamers modeled for each structure and the pre-calculated rotamer energies. Probabilistic inference is performed using the belief propagation or A* algorithms, and dead-end elimination can be applied as pre-processing. The output can either be a list of amino acid sequences simultaneously compatible with these structures, or probabilistic amino acid profiles compatible with the structures. In addition, higher order (e.g. pairwise) amino acid probabilities can also be predicted. Finally, SPRINT also has a module for protein side-chain prediction and single-state design. AVAILABILITY: The full C++ source code for SPRINT can be freely downloaded from http://www.protonet.cs.huji.ac.il/sprint.

Subject(s)

Algorithms , Proteins/chemistry , Software , Amino Acid Sequence , Amino Acids/chemistry , Statistical Models , Protein Conformation

16.

A holistic approach for suppression of COVID-19 spread in workplaces and universities.

Poole, Sarah F; Gronsbell, Jessica; Winter, Dale; Nickels, Stefanie; Levy, Roie; Fu, Bin; Burq, Maximilien; Saeb, Sohrab; Edwards, Matthew D; Behr, Michael K; Kumaresan, Vignesh; Macalalad, Alexander R; Shah, Sneh; Prevost, Michelle; Snoad, Nigel; Brenner, Michael P; Myers, Lance J; Varghese, Paul; Califf, Robert M; Washington, Vindell; Lee, Vivian S; Fromer, Menachem.

PLoS One ; 16(8): e0254798, 2021.

Article in English | MEDLINE | ID: mdl-34383766

ABSTRACT

As society has moved past the initial phase of the COVID-19 crisis that relied on broad-spectrum shutdowns as a stopgap method, industries and institutions have faced the daunting question of how to return to a stabilized state of activities and more fully reopen the economy. A core problem is how to return people to their workplaces and educational institutions in a manner that is safe, ethical, grounded in science, and takes into account the unique factors and needs of each organization and community. In this paper, we introduce an epidemiological model (the "Community-Workplace" model) that accounts for SARS-CoV-2 transmission within the workplace, within the surrounding community, and between them. We use this multi-group deterministic compartmental model to consider various testing strategies that, together with symptom screening, exposure tracking, and nonpharmaceutical interventions (NPI) such as mask wearing and physical distancing, aim to reduce disease spread in the workplace. Our framework is designed to be adaptable to a variety of specific workplace environments to support planning efforts as reopenings continue. Using this model, we consider a number of case studies, including an office workplace, a factory floor, and a university campus. Analysis of these cases illustrates that continuous testing can help a workplace avoid an outbreak by reducing undetected infectiousness even in high-contact environments. We find that a university setting, where individuals spend more time on campus and have a higher contact load, requires more testing to remain safe, compared to a factory or office setting. Under the modeling assumptions, we find that maintaining a prevalence below 3% can be achieved in an office setting by testing its workforce every two weeks, whereas achieving this same goal for a university could require as much as fourfold more testing (i.e., testing the entire campus population twice a week). 
Our model also simulates the dynamics of reduced spread that result from the introduction of mitigation measures when test results reveal the early stages of a workplace outbreak. We use this to show that a vigilant university that has the ability to quickly react to outbreaks can be justified in implementing testing at the same rate as a lower-risk office workplace. Finally, we quantify the devastating impact that an outbreak in a small-town college could have on the surrounding community, which supports the notion that communities can be better protected by supporting their local places of business in preventing onsite spread of disease.
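To make the idea of a multi-group deterministic compartmental model concrete, here is a minimal sketch of a two-group SIR system (workplace and community) with cross-group transmission. It is not the paper's "Community-Workplace" model: every parameter value, the name `simulate`, and the treatment of testing as an extra removal rate `tau` on infectious workers are assumptions made purely for illustration.

```python
def simulate(days, tau, beta_ww=0.5, beta_cc=0.25, beta_wc=0.05,
             gamma=0.1, seed=0.001, dt=0.1):
    """Euler-integrate a two-group SIR model.

    tau is a hypothetical extra removal rate for infectious workers,
    standing in for detection and isolation through testing.
    Returns (peak infectious workplace fraction, workplace state, community state).
    """
    s_w, i_w, r_w = 1.0 - seed, seed, 0.0   # workplace fractions
    s_c, i_c, r_c = 1.0 - seed, seed, 0.0   # community fractions
    peak_w = i_w
    for _ in range(int(days / dt)):
        # Force of infection: within-group plus cross-group contacts.
        lam_w = beta_ww * i_w + beta_wc * i_c
        lam_c = beta_cc * i_c + beta_wc * i_w
        new_w = lam_w * s_w * dt
        new_c = lam_c * s_c * dt
        rem_w = (gamma + tau) * i_w * dt     # recovery plus test-driven removal
        rem_c = gamma * i_c * dt
        s_w, i_w, r_w = s_w - new_w, i_w + new_w - rem_w, r_w + rem_w
        s_c, i_c, r_c = s_c - new_c, i_c + new_c - rem_c, r_c + rem_c
        peak_w = max(peak_w, i_w)
    return peak_w, (s_w, i_w, r_w), (s_c, i_c, r_c)

if __name__ == "__main__":
    for tau in (0.0, 0.1, 0.2):
        peak, _, _ = simulate(200, tau)
        print(f"tau={tau:.1f}  peak workplace prevalence={peak:.3f}")
```

Raising `tau` shortens the time an infectious worker remains in circulation, which is the mechanism by which more frequent testing lowers peak prevalence in this toy setting.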

Subject(s)

COVID-19/prevention & control , Contact Tracing/methods , Disease Outbreaks/prevention & control , Physical Distancing , Universities , Workplace , Humans

17.

Toward a Mobile Platform for Real-world Digital Measurement of Depression: User-Centered Design, Data Quality, and Behavioral and Clinical Modeling.

Nickels, Stefanie; Edwards, Matthew D; Poole, Sarah F; Winter, Dale; Gronsbell, Jessica; Rozenkrants, Bella; Miller, David P; Fleck, Mathias; McLean, Alan; Peterson, Bret; Chen, Yuanwei; Hwang, Alan; Rust-Smith, David; Brant, Arthur; Campbell, Andrew; Chen, Chen; Walter, Collin; Arean, Patricia A; Hsin, Honor; Myers, Lance J; Marks, William J; Mega, Jessica L; Schlosser, Danielle A; Conrad, Andrew J; Califf, Robert M; Fromer, Menachem.

JMIR Ment Health ; 8(8): e27589, 2021 Aug 10.

Article in English | MEDLINE | ID: mdl-34383685

ABSTRACT

BACKGROUND: Although effective mental health treatments exist, the ability to match individuals to optimal treatments is poor, and timely assessment of response is difficult. One reason for these challenges is the lack of objective measurement of psychiatric symptoms. Sensors and active tasks recorded by smartphones provide a low-burden, low-cost, and scalable way to capture real-world data from patients that could augment clinical decision-making and move the field of mental health closer to measurement-based care. OBJECTIVE: This study tests the feasibility of a fully remote study on individuals with self-reported depression using an Android-based smartphone app to collect subjective and objective measures associated with depression severity. The goals of this pilot study are to develop an engaging user interface for high task adherence through user-centered design; test the quality of collected data from passive sensors; start building clinically relevant behavioral measures (features) from passive sensors and active inputs; and preliminarily explore connections between these features and depression severity. METHODS: A total of 600 participants were asked to download the study app to join this fully remote, observational 12-week study. The app passively collected 20 sensor data streams (eg, ambient audio level, location, and inertial measurement units), and participants were asked to complete daily survey tasks, weekly voice diaries, and the clinically validated Patient Health Questionnaire (PHQ-9) self-survey. Pairwise correlations between derived behavioral features (eg, weekly minutes spent at home) and PHQ-9 were computed. Using these behavioral features, we also constructed an elastic net penalized multivariate logistic regression model predicting depressed versus nondepressed PHQ-9 scores (ie, dichotomized PHQ-9). RESULTS: A total of 415 individuals logged into the app. 
Over the course of the 12-week study, these participants completed 83.35% (4151/4980) of the PHQ-9s. Applying data sufficiency rules for minimally necessary daily and weekly data resulted in 3779 participant-weeks of data across 384 participants. Using a subset of 34 behavioral features, we found that 11 features showed a significant (P<.001 Benjamini-Hochberg adjusted) Spearman correlation with weekly PHQ-9, including voice diary-derived word sentiment and ambient audio levels. Restricting the data to those cases in which all 34 behavioral features were present, we had available 1013 participant-weeks from 186 participants. The logistic regression model predicting depression status resulted in a 10-fold cross-validated mean area under the curve of 0.656 (SD 0.079). CONCLUSIONS: This study finds a strong proof of concept for the use of a smartphone-based assessment of depression outcomes. Behavioral features derived from passive sensors and active tasks show promising correlations with a validated clinical measure of depression (PHQ-9). Future work is needed to increase scale that may permit the construction of more complex (eg, nonlinear) predictive models and better handle data missingness.
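The abstract above reports Benjamini-Hochberg adjusted P values for the feature correlations. As a hedged sketch of how that adjustment works in general (not the study's own analysis code; the function name and the example p-values are made up for illustration):

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest), then a
    running minimum from the largest rank downward enforces monotonicity.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

if __name__ == "__main__":
    # Hypothetical raw p-values for four behavioral features.
    raw = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(raw))
```

A feature would then be called significant when its adjusted value falls below the chosen threshold, such as the .001 level quoted in the abstract.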

18.

Design of multispecific protein sequences using probabilistic graphical modeling.

Fromer, Menachem; Yanover, Chen; Linial, Michal.

Proteins ; 78(3): 530-47, 2010 Feb 15.

Article in English | MEDLINE | ID: mdl-19842166

ABSTRACT

In nature, proteins partake in numerous protein-protein interactions that mediate their functions. Moreover, proteins have been shown to be physically stable in multiple structures, induced by cellular conditions, small ligands, or covalent modifications. Understanding how protein sequences achieve this structural promiscuity at the atomic level is a fundamental step in the drug design pipeline and a critical question in protein physics. One way to investigate this subject is to computationally predict protein sequences that are compatible with multiple states, i.e., multiple target structures or binding to distinct partners. The goal of engineering such proteins has been termed multispecific protein design. We develop a novel computational framework to efficiently and accurately perform multispecific protein design. This framework utilizes recent advances in probabilistic graphical modeling to predict sequences with low energies in multiple target states. Furthermore, it is also geared to specifically yield positional amino acid probability profiles compatible with these target states. Such profiles can be used as input to randomly bias high-throughput experimental sequence screening techniques, such as phage display, thus providing an alternative avenue for elucidating the multispecificity of natural proteins and the synthesis of novel proteins with specific functionalities. We prove the utility of such multispecific design techniques in better recovering amino acid sequence diversities similar to those resulting from millions of years of evolution. We then compare the approaches of prediction of low energy ensembles and of amino acid profiles and demonstrate their complementarity in providing more robust predictions for protein design.

Subject(s)

Amino Acid Sequence , Computational Biology/methods , Chemical Models , Statistical Models , Proteins/chemistry , Algorithms , Molecular Evolution , Biological Models , Molecular Models , Molecular Sequence Data , Peroxisome Proliferator-Activated Receptors/chemistry , Peroxisome Proliferator-Activated Receptors/genetics , Proteins/genetics , Structure-Activity Relationship , Temperature , Thioredoxins/chemistry , Thioredoxins/genetics , Transducin/chemistry , Transducin/genetics

19.

Viral adaptation to host: a proteome-based analysis of codon usage and amino acid preferences.

Bahir, Iris; Fromer, Menachem; Prat, Yosef; Linial, Michal.

Mol Syst Biol ; 5: 311, 2009.

Article in English | MEDLINE | ID: mdl-19888206

ABSTRACT

Viruses differ markedly in their specificity toward host organisms. Here, we test the level of general sequence adaptation that viruses display toward their hosts. We compiled a representative data set of viruses that infect hosts ranging from bacteria to humans. We consider their respective amino acid and codon usages and compare them among the viruses and their hosts. We show that bacteria-infecting viruses are strongly adapted to their specific hosts, but that they differ from other unrelated bacterial hosts. Viruses that infect humans, but not those that infect other mammals or aves, show a strong resemblance to most mammalian and avian hosts, in terms of both amino acid and codon preferences. In groups of viruses that infect humans or other mammals, the highest observed level of adaptation of viral proteins to host codon usages is for those proteins that appear abundantly in the virion. In contrast, proteins that are known to participate in host-specific recognition do not necessarily adapt to their respective hosts. The implication for the potential of viral infectivity is discussed.

Subject(s)

Physiological Adaptation , Amino Acids/metabolism , Codon/genetics , Host-Pathogen Interactions/physiology , Proteome/metabolism , Viral Proteins/metabolism , Virus Physiological Phenomena , Amino Acids/genetics , Animals , Base Composition/genetics , Bias , Humans , Proteome/genetics , Viral Proteins/genetics , Viral Structural Proteins/genetics , Viral Structural Proteins/metabolism

20.

Tradeoff between stability and multispecificity in the design of promiscuous proteins.

Fromer, Menachem; Shifman, Julia M.

PLoS Comput Biol ; 5(12): e1000627, 2009 Dec.

Article in English | MEDLINE | ID: mdl-20041208

ABSTRACT

Natural proteins often partake in several highly specific protein-protein interactions. They are thus subject to multiple opposing forces during evolutionary selection. To be functional, such multispecific proteins need to be stable in complex with each interaction partner, and, at the same time, to maintain affinity toward all partners. How is this multispecificity acquired through natural evolution? To answer this compelling question, we study a prototypical multispecific protein, calmodulin (CaM), which has evolved to interact with hundreds of target proteins. Starting from high-resolution structures of sixteen CaM-target complexes, we employ state-of-the-art computational methods to predict a hundred CaM sequences best suited for interaction with each individual CaM target. Then, we design CaM sequences most compatible with each possible combination of two, three, and all sixteen targets simultaneously, producing almost 70,000 low energy CaM sequences. By comparing these sequences and their energies, we gain insight into how nature has managed to find the compromise between the need for favorable interaction energies and the need for multispecificity. We observe that designing for more partners simultaneously yields CaM sequences that better match natural sequence profiles, thus emphasizing the importance of such strategies in nature. Furthermore, we show that the CaM binding interface can be nicely partitioned into positions that are critical for the affinity of all CaM-target complexes and those that are molded to provide interaction specificity. We reveal several basic categories of sequence-level tradeoffs that enable the compromise necessary for the promiscuity of this protein. We also thoroughly quantify the tradeoff between interaction energetics and multispecificity and find that facilitating seemingly competing interactions requires only a small deviation from optimal energies. 
We conclude that multispecific proteins have been subjected to a rigorous optimization process that has fine-tuned their sequences for interactions with a precise set of targets, thus conferring their multiple cellular functions.

Subject(s)

Calmodulin/chemistry , Drug Design , Chemical Models , Protein Sequence Analysis/methods , Amino Acid Sequence , Binding Sites , Computer Simulation , Molecular Sequence Data , Protein Binding
