Search | Regional Portal of the BVS (Virtual Health Library)

1.

Systematic analysis of human antibody response to ebolavirus glycoprotein shows high prevalence of neutralizing public clonotypes.

Chen, Elaine C; Gilchuk, Pavlo; Zost, Seth J; Ilinykh, Philipp A; Binshtein, Elad; Huang, Kai; Myers, Luke; Bonissone, Stefano; Day, Samuel; Kona, Chandrahaas R; Trivette, Andrew; Reidy, Joseph X; Sutton, Rachel E; Gainza, Christopher; Diaz, Summer; Williams, Jazmean K; Selverian, Christopher N; Davidson, Edgar; Saphire, Erica Ollmann; Doranz, Benjamin J; Castellana, Natalie; Bukreyev, Alexander; Carnahan, Robert H; Crowe, James E.

Cell Rep ; 42(4): 112370, 2023 04 25.

Article in English | MEDLINE | ID: mdl-37029928

ABSTRACT

Understanding the human antibody response to emerging viral pathogens is key to epidemic preparedness. As the size of the B cell response to a pathogenic-virus-protective antigen is poorly defined, we perform deep paired heavy- and light-chain sequencing in Ebola virus glycoprotein (EBOV-GP)-specific memory B cells, allowing analysis of the ebolavirus-specific antibody repertoire both genetically and functionally. This approach facilitates investigation of the molecular and genetic basis for the evolution of cross-reactive antibodies by elucidating germline-encoded properties of antibodies to EBOV and identification of the overlap between antibodies in the memory B cell and serum repertoire. We identify 73 public clonotypes of EBOV, 20% of which encode antibodies with neutralization activity and capacity to protect mice in vivo. This comprehensive analysis of the public and private antibody repertoire provides insight into the molecular basis of the humoral immune response to EBOV GP, which informs the design of vaccines and improved therapeutics.

Subject(s)

Ebolavirus; Hemorrhagic Fever, Ebola; Humans; Animals; Mice; Antibodies, Neutralizing; Antibodies, Viral; Antibody Formation; Prevalence; Glycoproteins/genetics

2.

De novo sequencing and construction of a unique antibody for the recognition of alternative conformations of cytochrome c in cells.

Tomasina, Florencia; Martínez, Jennyfer; Zeida, Ari; Chiribao, María Laura; Demicheli, Verónica; Correa, Agustín; Quijano, Celia; Castro, Laura; Carnahan, Robert H; Vinson, Paige; Goff, Matt; Cooper, Tracy; McDonald, W Hayes; Castellana, Natalie; Hannibal, Luciana; Morse, Paul T; Wan, Junmei; Hüttemann, Maik; Jemmerson, Ronald; Piacenza, Lucía; Radi, Rafael.

Proc Natl Acad Sci U S A ; 119(47): e2213432119, 2022 11 22.

Article in English | MEDLINE | ID: mdl-36378644

ABSTRACT

Cytochrome c (cyt c) can undergo reversible conformational changes under biologically relevant conditions. Revealing these alternative cyt c conformers at the cell and tissue level is challenging. A monoclonal antibody (mAb) identifying a key conformational change in cyt c was previously reported, but the hybridoma was rendered nonviable. To resurrect the mAb in a recombinant form, the amino-acid sequences of the heavy and light chains were determined by peptide mapping-mass spectrometry-bioinformatic analysis and used to construct plasmids encoding the full-length chains. The recombinant mAb (R1D3) was shown to perform similarly to the original mAb in antigen-binding assays. The mAb bound to a variety of oxidatively modified cyt c species (e.g., nitrated at Tyr74 or oxidized at Met80), which lose the sixth heme ligation (Fe-Met80); it did not bind to several cyt c phospho- and acetyl-mimetics. Peptide competition assays together with molecular dynamic studies support that R1D3 binds a neoepitope within the loop 40-57. R1D3 was employed to identify alternative conformations of cyt c in cells under oxidant- or senescence-induced challenge as confirmed by immunocytochemistry and immunoaffinity studies. Alternative conformers translocated to the nuclei without causing apoptosis, an observation that was further confirmed after pinocytic loading of oxidatively modified cyt c to B16-F1 cells. Thus, alternative cyt c conformers, known to gain peroxidatic function, may represent redox messengers at the cell nuclei. The availability and properties of R1D3 open avenues of interrogation regarding the presence and biological functions of alternative conformations of cyt c in mammalian cells and tissues.

Subject(s)

Cytochromes c; Heme; Animals; Amino Acid Sequence; Antibodies, Monoclonal; Cytochromes c/chemistry; Heme/chemistry; Hybridomas; Oxidation-Reduction; Melanoma, Experimental; Mice

3.

INDI-integrated nanobody database for immunoinformatics.

Deszynski, Piotr; Mlokosiewicz, Jakub; Volanakis, Adam; Jaszczyszyn, Igor; Castellana, Natalie; Bonissone, Stefano; Ganesan, Rajkumar; Krawczyk, Konrad.

Nucleic Acids Res ; 50(D1): D1273-D1281, 2022 01 07.

Article in English | MEDLINE | ID: mdl-34747487

ABSTRACT

Nanobodies, a subclass of antibodies found in camelids, are versatile molecular binding scaffolds composed of a single polypeptide chain. The small size of nanobodies bestows multiple therapeutic advantages (stability, tumor penetration) with the first therapeutic approval in 2018 cementing the clinical viability of this format. Structured data and sequence information of nanobodies will enable the accelerated clinical development of nanobody-based therapeutics. Though the nanobody sequence and structure data are deposited in the public domain at an accelerating pace, the heterogeneity of sources and lack of standardization hamper reliable harvesting of nanobody information. We address this issue by creating the Integrated Database of Nanobodies for Immunoinformatics (INDI, http://naturalantibody.com/nanobodies). INDI collates nanobodies from all the major public outlets of biological sequences: patents, GenBank, next-generation sequencing repositories, structures and scientific publications. We equip INDI with powerful nanobody-specific sequence and text search facilitating access to >11 million nanobody sequences. INDI should facilitate development of novel nanobody-specific computational protocols helping to deliver on the therapeutic promise of this drug format.

Subject(s)

Camelidae/immunology; Databases, Genetic; Neoplasms/therapy; Single-Domain Antibodies/immunology; Amino Acid Sequence/genetics; Animals; Antibodies/classification; Antibodies/immunology; Camelidae/classification; Humans; Immunotherapy/classification; Neoplasms/immunology; Single-Domain Antibodies/classification

4.

Proteo-Genomic Analysis Identifies Two Major Sites of Vulnerability on Ebolavirus Glycoprotein for Neutralizing Antibodies in Convalescent Human Plasma.

Gilchuk, Pavlo; Guthals, Adrian; Bonissone, Stefano R; Shaw, Jared B; Ilinykh, Philipp A; Huang, Kai; Bombardi, Robin G; Liang, Jenny; Grinyo, Ariadna; Davidson, Edgar; Chen, Elaine C; Gunn, Bronwyn M; Alter, Galit; Saphire, Erica Ollmann; Doranz, Benjamin J; Bukreyev, Alexander; Zeitlin, Larry; Castellana, Natalie; Crowe, James E.

Front Immunol ; 12: 706757, 2021.

Article in English | MEDLINE | ID: mdl-34335620

ABSTRACT

Three clinically relevant ebolaviruses, Ebola (EBOV), Bundibugyo (BDBV), and Sudan (SUDV) viruses, are responsible for severe disease and occasional deadly outbreaks in Africa. The largest Ebola virus disease (EVD) epidemic to date in 2013-2016 in West Africa highlighted the urgent need for countermeasures, leading to the development and FDA approval of the Ebola virus vaccine rVSV-ZEBOV (Ervebo®) in 2020 and two monoclonal antibody (mAb)-based therapeutics (Inmazeb® [atoltivimab, maftivimab, and odesivimab-ebgn] and Ebanga® [ansuvimab-zykl]) in 2020. The humoral response plays an indispensable role in ebolavirus immunity, based on studies of mAbs isolated from the antibody genes of ebolavirus-specific human memory B cells circulating in peripheral blood. However, antibodies in the body are not secreted by circulating memory B cells in the blood but rather principally by plasma cells in the bone marrow. Little is known about the protective polyclonal antibody responses in convalescent plasma. Here we exploited both single-cell antibody gene sequencing and proteomic sequencing approaches to assess the composition of the ebolavirus glycoprotein (GP)-reactive antibody repertoire in the plasma of an EVD survivor. We first identified 1,512 GP-specific mAb variable gene sequences from single cells in the memory B cell compartment. Using mass spectrometric analysis of the corresponding GP-specific plasma IgG, we found that only a portion of the large B cell antibody repertoire was represented in the plasma. Molecular and functional analysis of proteomics-identified mAbs revealed recognition of epitopes in three major antigenic sites - the GP head domain, the glycan cap, and the base region, with a high prevalence of neutralizing and protective mAb specificities that targeted the base and glycan cap regions on the GP. Polyclonal plasma antibodies from the survivor reacted broadly to EBOV, BDBV, and SUDV GP, while reactivity of the potently neutralizing mAbs we identified was limited mostly to the homologous EBOV GP. Together these results reveal a restricted diversity of neutralizing humoral response in which mAbs targeting two antigenic sites on GP - glycan cap and base - play a principal role in plasma-antibody-mediated protective immunity against EVD.

Subject(s)

Antibodies, Neutralizing/immunology; Antibodies, Viral/immunology; Antigens, Viral/immunology; Ebolavirus/immunology; Membrane Glycoproteins/immunology; Adult; Hemorrhagic Fever, Ebola/immunology; Humans; Male; Proteomics

5.

An automated proteogenomic method uses mass spectrometry to reveal novel genes in Zea mays.

Castellana, Natalie E; Shen, Zhouxin; He, Yupeng; Walley, Justin W; Cassidy, California Jack; Briggs, Steven P; Bafna, Vineet.

Mol Cell Proteomics ; 13(1): 157-67, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24142994

ABSTRACT

New technologies in genomics and proteomics have influenced the emergence of proteogenomics, a field at the confluence of genomics, transcriptomics, and proteomics. First generation proteogenomic toolkits employ peptide mass spectrometry to identify novel protein coding regions. We extend first generation proteogenomic tools to achieve greater accuracy and enable the analysis of large, complex genomes. We apply our pipeline to Zea mays, which has a genome comparable in size to human. Our pipeline begins with the comparison of mass spectra to a putative translation of the genome. We select novel peptides, those that match a region of the genome that was not previously known to be protein coding, for grouping into refinement events. We present a novel, probabilistic framework for evaluating the accuracy of each event. Our calculated event probability, or eventProb, considers the number of supporting peptides and spectra, and the quality of each supporting peptide-spectrum match. Our pipeline predicts 165 novel protein-coding genes and proposes updated models for 741 additional genes.

Subject(s)

Genomics; Proteomics; Zea mays/genetics; Genome, Plant; Humans; Mass Spectrometry; Open Reading Frames

6.

Proteogenomic database construction driven from large scale RNA-seq data.

Woo, Sunghee; Cha, Seong Won; Merrihew, Gennifer; He, Yupeng; Castellana, Natalie; Guest, Clark; MacCoss, Michael; Bafna, Vineet.

J Proteome Res ; 13(1): 21-8, 2014 Jan 03.

Article in English | MEDLINE | ID: mdl-23802565

ABSTRACT

The advent of inexpensive RNA-seq technologies and other deep sequencing technologies for RNA has the promise to radically improve genomic annotation, providing information on transcribed regions and splicing events in a variety of cellular conditions. Using MS-based proteogenomics, many of these events can be confirmed directly at the protein level. However, the integration of large amounts of redundant RNA-seq data and mass spectrometry data poses a challenging problem. Our paper addresses this by construction of a compact database that contains all useful information expressed in RNA-seq reads. Applying our method to cumulative C. elegans data reduced 496.2 GB of aligned RNA-seq SAM files to 410 MB of splice graph database written in FASTA format. This corresponds to 1000× compression of data size, without loss of sensitivity. We performed a proteogenomics study using the custom data set, using a completely automated pipeline, and identified a total of 4044 novel events, including 215 novel genes, 808 novel exons, 12 alternative splicings, 618 gene-boundary corrections, 245 exon-boundary changes, 938 frame shifts, 1166 reverse strands, and 42 translated UTRs. Our results highlight the usefulness of transcript + proteomic integration for improved genome annotations.
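The figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes decimal units (1 GB = 1000 MB), which the abstract does not specify, and the variable names are hypothetical.

```python
# Figures as quoted in the abstract.
# Assumption: decimal units (1 GB = 1000 MB); the abstract does not say which convention is used.
sam_size_mb = 496.2 * 1000   # aligned RNA-seq SAM files
db_size_mb = 410.0           # splice graph database in FASTA format

compression_ratio = sam_size_mb / db_size_mb
print(f"compression ratio: about {compression_ratio:.0f}x")

# Novel event categories listed in the abstract.
events = {
    "novel genes": 215,
    "novel exons": 808,
    "alternative splicings": 12,
    "gene-boundary corrections": 618,
    "exon-boundary changes": 245,
    "frame shifts": 938,
    "reverse strands": 1166,
    "translated UTRs": 42,
}
total_events = sum(events.values())
print(f"total novel events: {total_events}")
```

Under that unit assumption the ratio comes out slightly above 1200, which is consistent with the rounded "1000×" in the abstract, and the eight categories add up to the stated total.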

Subject(s)

Caenorhabditis elegans/metabolism; Databases, Genetic; Databases, Protein; Genome; Proteome; Sequence Analysis, RNA; Amino Acid Sequence; Animals; Automation; Caenorhabditis elegans/genetics; Helminth Proteins/chemistry; Helminth Proteins/genetics; Helminth Proteins/metabolism; Molecular Sequence Data

7.

MORPH-PRO: a novel algorithm and web server for protein morphing.

Castellana, Natalie E; Lushnikov, Andrey; Rotkiewicz, Piotr; Sefcovic, Natasha; Pevzner, Pavel A; Godzik, Adam; Vyatkina, Kira.

Algorithms Mol Biol ; 8(1): 19, 2013 Jul 11.

Article in English | MEDLINE | ID: mdl-23844614

ABSTRACT

BACKGROUND: Proteins are known to be dynamic in nature, changing from one conformation to another while performing vital cellular tasks. It is important to understand these movements in order to better understand protein function. At the same time, experimental techniques provide us with only single snapshots of the whole ensemble of available conformations. Computational protein morphing provides a visualization of a protein structure transitioning from one conformation to another by producing a series of intermediate conformations. RESULTS: We present a novel, efficient morphing algorithm, Morph-Pro, based on linear interpolation. We also show that apart from visualization, morphing can be used to provide plausible intermediate structures. We test this by using the intermediate structures of a c-Jun N-terminal kinase (JNK1) conformational change in a virtual docking experiment. The structures are shown to dock with higher score to known JNK1-binding ligands than structures solved using X-ray crystallography. This experiment demonstrates the potential applications of the intermediate structures in modeling or virtual screening efforts. CONCLUSIONS: Visualization of protein conformational changes is important for characterization of protein function. Furthermore, the intermediate structures produced by our algorithm are good approximations to true structures. We believe there is great potential for these computationally predicted structures in protein-ligand docking experiments and virtual screening. The Morph-Pro web server can be accessed at http://morph-pro.bioinf.spbau.ru.
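As a rough illustration of what morphing by linear interpolation means, the sketch below interpolates atom coordinates between two conformations. It is not the Morph-Pro algorithm, whose details the abstract does not give: it assumes the two structures are already superimposed and list the same atoms in the same order, and the function name `interpolate_structures` is hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[float, float, float]


def interpolate_structures(
    start: List[Coord], end: List[Coord], n_intermediates: int
) -> List[List[Coord]]:
    """Return n_intermediates frames evenly spaced between start and end.

    Assumption: both structures are superimposed and have matching atom order.
    """
    if len(start) != len(end):
        raise ValueError("structures must contain the same number of atoms")
    frames = []
    for step in range(1, n_intermediates + 1):
        t = step / (n_intermediates + 1)  # interpolation parameter in (0, 1)
        frame = [
            tuple(a + t * (b - a) for a, b in zip(p, q))
            for p, q in zip(start, end)
        ]
        frames.append(frame)
    return frames


# Toy example with two "atoms"; the coordinates are made up.
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
end = [(2.0, 0.0, 0.0), (1.0, 4.0, 0.0)]
frames = interpolate_structures(start, end, 3)
print(frames[1])  # the midpoint frame
```

Plain Cartesian interpolation like this can distort bond lengths and angles in the intermediate frames, which is one reason a practical morphing method needs more than this sketch.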

8.

Plant proteogenomics: from protein extraction to improved gene predictions.

Chapman, Brett; Castellana, Natalie; Apffel, Alex; Ghan, Ryan; Cramer, Grant R; Bellgard, Matthew; Haynes, Paul A; Van Sluyter, Steven C.

Methods Mol Biol ; 1002: 267-94, 2013.

Article in English | MEDLINE | ID: mdl-23625410

ABSTRACT

Historically, many genome annotation strategies have lacked experimental evidence at the protein level and have instead relied heavily on ab initio gene prediction tools, which consequently resulted in many incorrectly annotated genomic sequences. Proteogenomics aims to address these issues by using mass spectrometry (MS)-based proteomics and genomic mapping, and by providing statistical significance measures such as false discovery rates (FDRs) to validate the mapped peptides. Presented here is a tool capable of meeting this goal, the UCSD proteogenomic pipeline, which maps peptide-spectrum matches (PSMs) to the genome using the Inspect MS/MS database search tool and assigns a statistical significance to the match using a target-decoy search approach to assign estimated FDRs. This pipeline also provides the option of using a more reliable approach to proteogenomics by determining the precise false-positive rates (FPRs) and p-values of each PSM by calculating their spectral probabilities and rescoring each PSM accordingly. In addition to the protein prediction challenges in the rapidly growing number of sequenced plant genomes, it is difficult to extract high-quality protein samples from many plant species. For that reason, this chapter contains methods for protein extraction and trypsin digestion that reliably produce samples suitable for proteogenomic analysis.
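The target-decoy idea mentioned above can be sketched in a few lines. This is a generic, simplified estimator (decoy hits divided by target hits above a score threshold), not the exact procedure of the UCSD pipeline or Inspect. It assumes a decoy database of the same size as the target database; the function name `estimate_fdr` and the scores are hypothetical.

```python
from typing import List, Tuple


def estimate_fdr(psms: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate the FDR among PSMs scoring at or above threshold.

    psms holds (score, is_decoy) pairs.
    Assumption: the decoy database matches the target database in size,
    so decoy hits approximate the number of false target hits.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        # No accepted target matches: report 1.0 if only decoys passed, else 0.0.
        return 1.0 if decoys else 0.0
    return decoys / targets


# Made-up PSM scores for illustration only.
psms = [(9.1, False), (8.7, False), (8.2, True), (7.9, False), (7.5, False), (6.0, True)]
fdr = estimate_fdr(psms, threshold=7.0)
print(f"estimated FDR at score >= 7.0: {fdr:.2f}")
```

In practice the threshold is lowered step by step until the estimated FDR reaches the chosen cutoff, such as 1%.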

Subject(s)

Chromosome Mapping; Peptide Mapping; Plant Proteins/analysis; Plant Proteins/genetics; Plants/chemistry; Plants/genetics; Algorithms; Chromatography, Liquid; Databases, Protein; Genome, Plant; Genomics; Mass Spectrometry; Peptides/analysis; Peptides/chemistry; Proteomics; Search Engine

9.

Resurrection of a clinical antibody: template proteogenomic de novo proteomic sequencing and reverse engineering of an anti-lymphotoxin-α antibody.

Castellana, Natalie E; McCutcheon, Krista; Pham, Victoria C; Harden, Kristin; Nguyen, Allen; Young, Judy; Adams, Camellia; Schroeder, Kurt; Arnott, David; Bafna, Vineet; Grogan, Jane L; Lill, Jennie R.

Proteomics ; 11(3): 395-405, 2011 Feb.

Article in English | MEDLINE | ID: mdl-21268269

ABSTRACT

A mouse hybridoma antibody directed against a member of the tumour necrosis factor (TNF)-superfamily, lymphotoxin-alpha (LT-α), was isolated from stored mouse ascites and purified to homogeneity. After more than a decade of storage the genetic material was not available for cloning; however, biochemical assays with the ascites showed this antibody against LT-α (LT-3F12) to be a preclinical candidate for the treatment of several inflammatory pathologies. We have successfully rescued the LT-3F12 antibody by performing MS analysis, primary amino acid sequence determination by template proteogenomics, and synthesis of the corresponding recombinant DNA by reverse engineering. The resurrected antibody was expressed, purified and shown to demonstrate the desired specificity and binding properties in a panel of immuno-biochemical tests. The work described herein demonstrates the powerful combination of high-throughput informatic proteomic de novo sequencing with reverse engineering to reestablish monoclonal antibody-expressing cells from archived protein sample, exemplifying the development of novel therapeutics from cryptic protein sources.

Subject(s)

Antibodies, Anti-Idiotypic/metabolism; Antibodies, Monoclonal/metabolism; Genetic Engineering; Genomics; Lymphotoxin-alpha/metabolism; Proteomics; Recombinant Proteins/metabolism; Amino Acid Sequence; Animals; Antibodies, Anti-Idiotypic/genetics; Antibodies, Anti-Idiotypic/immunology; Antibodies, Monoclonal/genetics; Antibodies, Monoclonal/immunology; Cells, Cultured; Endothelium, Vascular/cytology; Endothelium, Vascular/metabolism; Hybridomas; Lymphotoxin-alpha/genetics; Lymphotoxin-alpha/immunology; Mice; Molecular Sequence Data; Recombinant Proteins/genetics; Recombinant Proteins/immunology; Sequence Homology, Amino Acid; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization; Umbilical Veins/cytology; Umbilical Veins/metabolism

10.

Proteogenomics to discover the full coding content of genomes: a computational perspective.

Castellana, Natalie; Bafna, Vineet.

J Proteomics ; 73(11): 2124-35, 2010 Oct 10.

Article in English | MEDLINE | ID: mdl-20620248

ABSTRACT

Proteogenomics has emerged as a field at the junction of genomics and proteomics. It is a loose collection of technologies that allow the search of tandem mass spectra against genomic databases to identify and characterize protein-coding genes. Proteogenomic peptides provide invaluable information for gene annotation, which is difficult or impossible to ascertain using standard annotation methods. Examples include confirmation of translation, reading-frame determination, identification of gene and exon boundaries, evidence for post-translational processing, identification of splice-forms including alternative splicing, and also, prediction of completely novel genes. For proteogenomics to deliver on its promise, however, it must overcome a number of technological hurdles, including speed and accuracy of peptide identification, construction and search of specialized databases, correction of sampling bias, and others. This article reviews the state of the art of the field, focusing on the current successes, and the role of computation in overcoming these challenges. We describe how technological and algorithmic advances have already enabled large-scale proteogenomic studies in many model organisms, including arabidopsis, yeast, fly, and human. We also provide a preview of the field going forward, describing early efforts in tackling the problems of complex gene structures, searching against genomes of related species, and immunoglobulin gene reconstruction.

Subject(s)

Computational Biology/methods; Genome/genetics; Open Reading Frames/genetics; Proteome/genetics; Animals; Computational Biology/trends; Humans; Proteomics/methods; Proteomics/trends

11.

Template proteogenomics: sequencing whole proteins using an imperfect database.

Castellana, Natalie E; Pham, Victoria; Arnott, David; Lill, Jennie R; Bafna, Vineet.

Mol Cell Proteomics ; 9(6): 1260-70, 2010 Jun.

Article in English | MEDLINE | ID: mdl-20164058

ABSTRACT

Database search algorithms are the primary workhorses for the identification of tandem mass spectra. However, these methods are limited to the identification of spectra for which peptides are present in the database, preventing the identification of peptides from mutated or alternatively spliced sequences. A variety of methods has been developed to search a spectrum against a sequence allowing for variations. Some tools determine the sequence of the homologous protein in the related species but do not report the peptide in the target organism. Other tools consider variations, including modifications and mutations, in reconstructing the target sequence. However, these tools will not work if the template (homologous peptide) is missing in the database, and they do not attempt to reconstruct the entire protein target sequence. De novo identification of peptide sequences is another possibility, because it does not require a protein database. However, the lack of a database reduces accuracy. We present a novel proteogenomic approach, GenoMS, that draws on the strengths of database and de novo peptide identification methods. Protein sequence templates (i.e. proteins or genomic sequences that are similar to the target protein) are identified using the database search tool InsPecT. The templates are then used to recruit, align, and de novo sequence regions of the target protein that have diverged from the database or are missing. We used GenoMS to reconstruct the full sequence of an antibody by using spectra acquired from multiple digests using different proteases. Antibodies are a prime example of proteins that confound standard database identification techniques. The mature antibody genes result from large-scale genome rearrangements with flexible fusion boundaries and somatic hypermutation. Using GenoMS we automatically reconstruct the complete sequences of two immunoglobulin chains with accuracy greater than 98% using a diverged protein database. Using the genome as the template, we achieve accuracy exceeding 97%.

Subject(s)

Databases, Protein; Proteomics/methods; Sequence Analysis, Protein/methods; Templates, Genetic; Algorithms; Amino Acid Sequence; Animals; Immunoglobulins/biosynthesis; Immunoglobulins/chemistry; Markov Chains; Mice; Receptors, Immunologic/chemistry; Receptors, Immunologic/metabolism; Sequence Alignment; Tandem Mass Spectrometry

12.

Discovery and revision of Arabidopsis genes by proteogenomics.

Castellana, Natalie E; Payne, Samuel H; Shen, Zhouxin; Stanke, Mario; Bafna, Vineet; Briggs, Steven P.

Proc Natl Acad Sci U S A ; 105(52): 21034-8, 2008 Dec 30.

Article in English | MEDLINE | ID: mdl-19098097

ABSTRACT

Gene annotation underpins genome science. Most often protein coding sequence is inferred from the genome based on transcript evidence and computational predictions. While generally correct, gene models suffer from errors in reading frame, exon border definition, and exon identification. To ascertain the error rate of Arabidopsis thaliana gene models, we isolated proteins from a sample of Arabidopsis tissues and determined the amino acid sequences of 144,079 distinct peptides by tandem mass spectrometry. The peptides corresponded to 1 or more of 3 different translations of the genome: a 6-frame translation, an exon splice-graph, and the currently annotated proteome. The majority of the peptides (126,055) resided in existing gene models (12,769 confirmed proteins), comprising 40% of annotated genes. Surprisingly, 18,024 novel peptides were found that do not correspond to annotated genes. Using the gene finding program AUGUSTUS and 5,426 novel peptides that occurred in clusters, we discovered 778 new protein-coding genes and refined the annotation of an additional 695 gene models. The remaining 13,449 novel peptides provide high quality annotation (>99% correct) for thousands of additional genes. Our observation that 18,024 of 144,079 peptides did not match current gene models suggests that 13% of the Arabidopsis proteome was incomplete due to approximately equal numbers of missing and incorrect gene models.
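The 6-frame translation mentioned in this abstract is a standard construction: the DNA is translated in three reading frames on each strand, so peptides can be matched without relying on existing gene models. The following is a minimal sketch using the standard genetic code. It is not the authors' pipeline, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; unknown codons become 'X', stops become '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Return translations for frames +1..+3 (forward) and -1..-3 (reverse strand)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Toy sequence for illustration only.
frames = six_frame_translation("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

A real proteogenomic search database would additionally split each frame at stop codons and index the resulting open reading frames by genomic coordinate.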

Subject(s)

Arabidopsis Proteins/genetics; Arabidopsis/genetics; Genome, Plant/genetics; Proteome/genetics; Proteomics; Software; Models, Genetic; Proteomics/methods

13.

Relaxing haplotype block models for association testing.

Castellana, Natalie; Dhamdhere, Kedar; Sridhar, Srinath; Schwartz, Russell.

Pac Symp Biocomput ; : 454-66, 2006.

Article in English | MEDLINE | ID: mdl-17094260

ABSTRACT

The arrival of publicly available genome-wide variation data is creating new opportunities for reconciling model-based methods for associating genotypes and phenotypes with the complexities of real genome data. Such data is particularly valuable for testing the utility of models of conserved haplotype structure to association studies. While there is much interest in "haplotype block" models that assume population-wide regions of low diversity, there is also evidence that such models eliminate correlations potentially useful to association studies. We investigate the value of relaxing the rigidity of block models by developing an association testing method using the previously developed "haplotype motif" model, which retains the notion of representing haploid sequences as concatenations of conserved haplotypes but abandons the assumption of population-wide block boundaries. We compare the effectiveness of motif, block, and single-variant models at finding association with simulated phenotypes using real and simulated data. We conclude that the benefits of haplotype models in any form are modest, but that haplotype models in general and block-free models in particular are useful in picking up correlations near the boundaries of the detectable level.

Subject(s)

Haplotypes; Models, Genetic; Chromosomes, Human, Pair 22/genetics; Computational Biology; Computer Simulation; Databases, Genetic; Electronic Data Processing; Humans; Lod Score; Polymorphism, Single Nucleotide
