Results 1 - 20 of 46
1.
Bioinformatics ; 40(2), 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38364309

ABSTRACT

MOTIVATION: Estimating the individual inbreeding coefficient and pairwise kinship is an important problem in human genetics (e.g. in disease mapping) and in animal and plant genetics (e.g. inbreeding design). Existing methods, such as the sample correlation-based genetic relationship matrix, KING, and UKin, are either biased, or not able to estimate inbreeding coefficients, or produce a large proportion of negative estimates that are difficult to interpret. This limitation of existing methods is partly due to failure to explicitly model inbreeding. Since all humans are inbred to various degrees by virtue of shared ancestries, it is prudent to account for inbreeding when inferring kinship between individuals. RESULTS: We present "Kindred," an approach that estimates inbreeding and kinship by modeling latent identity-by-descent states that account for all possible allele sharing, including inbreeding, between two individuals. Kindred uses the non-negative least squares method to fit the model, which not only increases computational efficiency compared to the maximum likelihood method, but also guarantees non-negativity of the kinship estimates. Through simulation, we demonstrate the high accuracy and non-negativity of kinship estimates by Kindred. By selecting a subset of SNPs that are similar in allele frequencies across different continental populations, Kindred can accurately estimate kinship between admixed samples. In addition, we demonstrate that the realized kinship matrix estimated by Kindred is effective in reducing genomic control values via linear mixed model in genome-wide association studies. Finally, we demonstrate that Kindred produces sensible heritability estimates on an Australian height dataset. AVAILABILITY AND IMPLEMENTATION: Kindred is implemented in C with multi-threading. It takes a VCF file or stream as input and works seamlessly with bcftools. Kindred is freely available at https://github.com/haplotype/kindred.


Subject(s)
Genome-Wide Association Study , Inbreeding , Animals , Humans , Australia , Genome , Gene Frequency , Pedigree
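
The non-negative least squares (NNLS) fit mentioned in the abstract can be illustrated with a generic solver. This is only a minimal sketch of NNLS by cyclic coordinate descent, not Kindred's implementation (which the abstract says is in C); the function name `nnls_coordinate_descent` and the toy matrix are assumptions made for illustration.

```python
def nnls_coordinate_descent(A, b, sweeps=200):
    """Minimize ||A x - b||^2 subject to x >= 0 by cyclic coordinate descent."""
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    residual = list(b)  # residual = b - A x, and x starts at zero
    for _ in range(sweeps):
        for j in range(n_cols):
            col = [A[i][j] for i in range(n_rows)]
            norm_sq = sum(c * c for c in col)
            if norm_sq == 0.0:
                continue  # an all-zero column cannot change the fit
            # Unconstrained one-dimensional optimum for x[j], then clip at zero.
            step = sum(c * r for c, r in zip(col, residual)) / norm_sq
            new_xj = max(0.0, x[j] + step)
            delta = new_xj - x[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                x[j] = new_xj
    return x


# Toy system (made up): the unconstrained least-squares solution is (2, -1),
# so the non-negativity constraint is active for the second coefficient.
A = [[1.0, 1.0],
     [0.0, 1.0]]
b = [1.0, -1.0]
x_hat = nnls_coordinate_descent(A, b)
```

In this toy case the constrained fit pins the second coefficient at zero instead of returning a negative value, which is the property the abstract relies on for interpretable kinship estimates.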
2.
Genome Res ; 30(9): 1364-1375, 2020 09.
Article in English | MEDLINE | ID: mdl-32883749

ABSTRACT

We present Nubeam (nucleotide be a matrix) as a novel reference-free approach to analyze short sequencing reads. Nubeam represents nucleotides by matrices, transforms a read into a product of matrices, and assigns numbers to reads based on the product matrix. Nubeam capitalizes on the noncommutative property of matrix multiplication, such that different reads are assigned different numbers and similar reads similar numbers. A sample, which is a collection of reads, becomes a collection of numbers that form an empirical distribution. We demonstrate that the genetic difference between samples can be quantified by the distance between empirical distributions. Nubeam includes the k-mer method as a special case, but unlike the k-mer method, it is convenient for Nubeam to account for GC bias and nucleotide quality. As a reference-free approach, Nubeam avoids reference bias and mapping bias, and can work with organisms without reference genomes. Thus, Nubeam is ideal to analyze data sets from metagenomics whole genome shotgun (WGS) sequencing, where the amount of unmapped reads is substantial. When applied to a WGS sequencing data set to quantify distances between metagenomics samples from various human body habitats, Nubeam recapitulates findings made by mapping-based methods and sheds light on contributions of unmapped reads. Nubeam is also useful in analyzing 16S rRNA sequencing data, which is a more prevalent type of data set in metagenomics studies. In our analysis, Nubeam recapitulated the findings that natural microbiota in mouse gut are resilient under challenges, and Nubeam detected differences in vaginal microbiota between cases of polycystic ovary syndrome and healthy controls.


Subject(s)
Metagenomics/methods , Whole Genome Sequencing/methods , Animals , Female , Gastrointestinal Microbiome , Humans , Mice , RNA, Ribosomal, 16S , Sequence Analysis, RNA/methods , Vagina/microbiology
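
The idea of turning a read into a product of matrices can be sketched in a few lines. The matrices below are an illustrative assumption, not necessarily the ones Nubeam uses: each nucleotide is encoded as a two-letter word in the matrices `L` and `R`, which are known to generate a free monoid, so under this encoding distinct reads give distinct products. The function names are hypothetical.

```python
L = ((1, 0), (1, 1))
R = ((1, 1), (0, 1))


def matmul(X, Y):
    """Multiply two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


# Hypothetical encoding: one fixed 2x2 matrix per nucleotide.
NUCLEOTIDE = {
    "A": matmul(L, L),
    "C": matmul(L, R),
    "G": matmul(R, L),
    "T": matmul(R, R),
}


def read_to_matrix(read):
    """Map a read to the ordered product of its nucleotide matrices."""
    product = ((1, 0), (0, 1))  # identity matrix
    for base in read:
        product = matmul(product, NUCLEOTIDE[base])
    return product


# Matrix multiplication is noncommutative, so the order of bases matters.
ac = read_to_matrix("AC")
ca = read_to_matrix("CA")
```

Because the product depends on the order of the factors, `"AC"` and `"CA"` map to different matrices even though they contain the same bases, which a simple base-count summary would not distinguish.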
3.
Bioinformatics ; 36(10): 3254-3256, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32091581

ABSTRACT

SUMMARY: We present Nubeam-dedup, a fast and RAM-efficient tool to de-duplicate sequencing reads without a reference genome. Nubeam-dedup represents nucleotides by matrices, transforms reads into products of matrices, and, based on the product, assigns a unique number to each read. Thus, duplicate reads can be efficiently removed by using a collisionless hash function. Compared with other state-of-the-art reference-free tools, Nubeam-dedup uses 50-70% of CPU time and 10-15% of RAM. AVAILABILITY AND IMPLEMENTATION: Source code in C++ and manual are available at https://github.com/daihang16/nubeamdedup and https://haplotype.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
High-Throughput Nucleotide Sequencing , Software , Algorithms , Genome , Sequence Analysis, DNA
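
Reference-free de-duplication with a collision-free key can be sketched as follows. This is not the Nubeam-dedup code (which is in C++); the per-nucleotide matrices and the names `read_key` and `dedup_reads` are assumptions chosen so that distinct reads always yield distinct keys.

```python
# Hypothetical 2x2 integer matrices, one per nucleotide. They are two-letter
# words in generators of a free monoid, so distinct reads give distinct products.
BASE_MATRIX = {
    "A": (1, 0, 2, 1),
    "C": (1, 1, 1, 2),
    "G": (2, 1, 1, 1),
    "T": (1, 2, 0, 1),
}


def read_key(read):
    """Return the entries (a, b, c, d) of the read's matrix product as a key."""
    a, b, c, d = 1, 0, 0, 1  # identity matrix
    for base in read:
        e, f, g, h = BASE_MATRIX[base]
        a, b, c, d = a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h
    return (a, b, c, d)


def dedup_reads(reads):
    """Keep the first occurrence of each read, comparing keys instead of strings."""
    seen = set()
    unique = []
    for read in reads:
        key = read_key(read)
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique


unique_reads = dedup_reads(["ACGT", "TTGA", "ACGT", "ACG"])
```

Python integers do not overflow, so the key stays exact for long reads; a fixed-width implementation would need to handle growth of the matrix entries.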
4.
Genet Med ; 22(2): 450, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31822850

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

5.
Genet Med ; 22(2): 301-308, 2020 02.
Article in English | MEDLINE | ID: mdl-31467446

ABSTRACT

PURPOSE: Fetal fraction (FF) is the percent of cell-free DNA (cfDNA) in the mother's peripheral blood that is of fetal origin, which plays a pivotal role in noninvasive prenatal screening (NIPS). We present a method that can reliably estimate FFs by examining autosome single-nucleotide polymorphisms (SNPs). METHODS: Even at a very low sequencing depth, there are plenty of SNPs covered by more than one read. At those SNPs, we define read heterozygosity and demonstrate that the percent of read heterozygosity is a function of FF, which allows FF to be inferred. RESULTS: We first demonstrated the effectiveness of our method in inferring FF. Then we used the inferred FF as an informative alternative prior to computing Bayes factors to test for aneuploidy, and observed better power than the Z-test. In analysis of clinical samples, we were able to identify female-male twins thanks to the accurate FF inference. CONCLUSION: Knowing FF improves efficacy of NIPS. It brings a powerful Bayesian method, allows "no call" for samples with small FFs, renders screening for XXY syndrome simpler, and permits an adaptive design to sequence at a higher depth for samples with small FFs.


Subject(s)
Cell-Free Nucleic Acids/analysis , Fetal Development/genetics , Noninvasive Prenatal Testing/methods , Chromosome Aberrations , Female , Fetus , High-Throughput Nucleotide Sequencing/methods , Humans , Polymorphism, Single Nucleotide/genetics , Pregnancy , Prenatal Care , Prenatal Diagnosis/methods , Sequence Analysis, DNA/methods
6.
PLoS Genet ; 12(2): e1005847, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26863142

ABSTRACT

Mexicans are a recent admixture of Amerindians, Europeans, and Africans. We performed local ancestry analysis of Mexican samples from two genome-wide association studies obtained from dbGaP, and discovered that at the MHC region Mexicans have excessive African ancestral alleles compared to the rest of the genome, which is the hallmark of recent selection for admixed samples. The estimated selection coefficients are 0.05 and 0.07 for two datasets, which put our finding among the strongest known selections observed in humans, namely, lactase selection in northern Europeans and sickle-cell trait in Africans. Using inaccurate Amerindian training samples was a major concern for the credibility of previously reported selection signals in Latinos. Taking advantage of the flexibility of our statistical model, we devised a model fitting technique that can learn Amerindian ancestral haplotype from the admixed samples, which allows us to infer local ancestries for Mexicans using only European and African training samples. The strong selection signal at the MHC remains without Amerindian training samples. Finally, we note that medical history studies suggest such a strong selection at MHC is plausible in Mexicans.


Subject(s)
Gene Pool , Major Histocompatibility Complex/genetics , Selection, Genetic , Black People/genetics , Gene Dosage , Genealogy and Heraldry , Humans , Mexico , Principal Component Analysis , White People/genetics
7.
Genet Med ; 20(8): 817-824, 2018 08.
Article in English | MEDLINE | ID: mdl-29120459

ABSTRACT

PURPOSE: Noninvasive prenatal screening (NIPS) sequences a mixture of the maternal and fetal cell-free DNA. Fetal trisomy can be detected by examining chromosomal dosages estimated from sequencing reads. The traditional method uses the Z-test, which compares a subject against a set of euploid controls, where the information of fetal fraction is not fully utilized. Here we present a Bayesian method that leverages informative priors on the fetal fraction. METHOD: Our Bayesian method combines the Z-test likelihood and informative priors of the fetal fraction, which are learned from the sex chromosomes, to compute Bayes factors. The Bayesian framework can account for nongenetic risk factors through the prior odds, and our method can report individual positive/negative predictive values. RESULTS: Our Bayesian method has more power than the Z-test method. We analyzed 3,405 NIPS samples and spotted at least 9 (of 51) possible Z-test false positives. CONCLUSION: Bayesian NIPS is more powerful than the Z-test method, is able to account for nongenetic risk factors through prior odds, and can report individual positive/negative predictive values.


Subject(s)
Bayes Theorem , Prenatal Diagnosis/methods , Sequence Analysis, DNA/methods , Adult , China , Female , Fetus , High-Throughput Nucleotide Sequencing/methods , Humans , Markov Chains , Pregnancy , Prenatal Care
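
How a Bayes factor and prior odds combine into an individual predictive value follows from the standard identity posterior odds = Bayes factor × prior odds. The sketch below shows only that arithmetic; the function name and the numbers are made up for illustration and do not come from the paper.

```python
def posterior_probability(bayes_factor, prior_probability):
    """Convert a Bayes factor and a prior probability into a posterior probability."""
    if not 0.0 < prior_probability < 1.0:
        raise ValueError("prior_probability must lie strictly between 0 and 1")
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical inputs: a prior risk of 1 in 1000 and a Bayes factor of 999
# in favor of trisomy give even posterior odds.
individual_ppv = posterior_probability(bayes_factor=999.0, prior_probability=0.001)
```

Nongenetic risk factors enter only through `prior_probability`, so the same Bayes factor yields a different posterior for subjects with different prior risk.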
8.
J Theor Biol ; 455: 342-356, 2018 10 14.
Article in English | MEDLINE | ID: mdl-30053386

ABSTRACT

Chikungunya, dengue, and Zika viruses, all transmitted by the Aedes aegypti and Aedes albopictus mosquito species, have been imported to Florida and have caused local outbreaks. We propose a deterministic model to study the importation and local transmission of these mosquito-borne diseases. The purpose is to model and mimic the importation of these viruses to Florida via travelers, local infections in domestic mosquitoes by imported travelers, and finally non-travel related transmissions to local humans by infected local mosquitoes. As a case study, the model will be used to simulate the cumulative Zika cases in Florida. Since the disease system is driven by a continuing input of infections from outside sources, orthodox analytic methods based on the calculation of the basic reproduction number are inadequate to describe and predict its behavior. Via steady-state analysis and sensitivity analysis, effective control and prevention measures for these mosquito-borne diseases are tested.


Subject(s)
Aedes/virology , Disease Outbreaks , Models, Biological , Mosquito Vectors/virology , Zika Virus Infection , Zika Virus , Animals , Chikungunya Fever/epidemiology , Chikungunya Fever/transmission , Chikungunya virus , Dengue/epidemiology , Dengue/transmission , Dengue Virus , Florida/epidemiology , Humans , Zika Virus Infection/epidemiology , Zika Virus Infection/transmission
9.
Biometrics ; 73(4): 1311-1320, 2017 12.
Article in English | MEDLINE | ID: mdl-28369699

ABSTRACT

Applications of spatial point processes for large and complex data sets with inhomogeneities as encountered, for example, in tropical rain forest ecology call for estimation methods that are both statistically and computationally efficient. We propose a novel second-order quasi-likelihood procedure to estimate the parameters for a second-order intensity reweighted stationary spatial point process. Our approach is to derive first- and second-order estimating functions and then combine them linearly using appropriate weight functions. In the stationary case, we argue that the asymptotically optimal weight functions are respectively a constant and a function of lags between distinct locations in the observation window. This leads to a considerable gain in computational efficiency. We further exploit this simplification in the nonstationary case. Simulations show that, when compared with several existing approaches, our method can achieve significant gains in statistical efficiency. An application to a tropical rain forest data set further illustrates the advantages of our procedure.


Subject(s)
Biometry , Ecology , Models, Statistical , Algorithms , Computer Simulation , Rainforest
10.
Stat Med ; 35(24): 4306-4319, 2016 10 30.
Article in English | MEDLINE | ID: mdl-27241902

ABSTRACT

Recurrent event data are quite common in biomedical and epidemiological studies. A significant portion of these data also contain additional longitudinal information on surrogate markers. Previous studies have shown that popular methods using a Cox model with longitudinal outcomes as time-dependent covariates may lead to biased results, especially when longitudinal outcomes are measured with error. Hence, it is important to incorporate longitudinal information into the analysis properly. To achieve this, we model the correlation between longitudinal and recurrent event processes using latent random effect terms. We then propose a two-stage conditional estimating equation approach to model the rate function of recurrent event process conditioned on the observed longitudinal information. The performance of our proposed approach is evaluated through simulation. We also apply the approach to analyze cocaine addiction data collected by the University of Connecticut Health Center. The data include recurrent event information on cocaine relapse and longitudinal cocaine craving scores. Copyright © 2016 John Wiley & Sons, Ltd.


Subject(s)
Data Accuracy , Longitudinal Studies , Cocaine-Related Disorders , Humans , Recurrence
11.
Stat Med ; 35(14): 2422-40, 2016 06 30.
Article in English | MEDLINE | ID: mdl-26790617

ABSTRACT

Spatiotemporal calibration of output from deterministic models is an increasingly popular tool to more accurately and efficiently estimate the true distribution of spatial and temporal processes. Current calibration techniques have focused on a single source of data on observed measurements of the process of interest that are both temporally and spatially dense. Additionally, these methods often calibrate deterministic models available in grid-cell format with pixel sizes small enough that the centroid of the pixel closely approximates the measurement for other points within the pixel. We develop a modeling strategy that allows us to simultaneously incorporate information from two sources of data on observed measurements of the process (that differ in their spatial and temporal resolutions) to calibrate estimates from a deterministic model available on a regular grid. This method not only improves estimates of the pollutant at the grid centroids but also refines the spatial resolution of the grid data. The modeling strategy is illustrated by calibrating and spatially refining daily estimates of ambient nitrogen dioxide concentration over Connecticut for 1994 from the Community Multiscale Air Quality model (temporally dense grid-cell estimates on a large pixel size) using observations from an epidemiologic study (spatially dense and temporally sparse) and Environmental Protection Agency monitoring stations (temporally dense and spatially sparse). Copyright © 2016 John Wiley & Sons, Ltd.


Subject(s)
Models, Statistical , Spatio-Temporal Analysis , Air Pollutants/analysis , Air Pollution/analysis , Air Pollution/statistics & numerical data , Biostatistics , Calibration , Connecticut , Environmental Exposure/analysis , Environmental Exposure/statistics & numerical data , Environmental Monitoring/statistics & numerical data , Humans , Nitrogen Dioxide/analysis , United States , United States Environmental Protection Agency
12.
Biometrics ; 71(4): 1022-33, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26102478

ABSTRACT

We introduce a new multivariate product-shot-noise Cox process which is useful for modeling multi-species spatial point patterns with clustering intra-specific interactions and neutral, negative, or positive inter-specific interactions. The auto- and cross-pair correlation functions of the process can be obtained in closed analytical forms and approximate simulation of the process is straightforward. We use the proposed process to model interactions within and among five tree species in the Barro Colorado Island plot.


Subject(s)
Proportional Hazards Models , Biometry/methods , Ecosystem , Linear Models , Models, Biological , Models, Statistical , Multivariate Analysis , Normal Distribution , Species Specificity , Trees
13.
Biometrics ; 71(1): 114-121, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25351292

ABSTRACT

We propose a novel statistical framework by supplementing case-control data with summary statistics on the population at risk for a subset of risk factors. Our approach is to first form two unbiased estimating equations, one based on the case-control data and the other on both the case data and the summary statistics, and then optimally combine them to derive another estimating equation to be used for the estimation. The proposed method is computationally simple and more efficient than standard approaches based on case-control data alone. We also establish asymptotic properties of the resulting estimator, and investigate its finite-sample performance through simulation. As a substantive application, we apply the proposed method to investigate risk factors for endometrial cancer, by using data from a recently completed population-based case-control study and summary statistics from the Behavioral Risk Factor Surveillance System, the Population Estimates Program of the US Census Bureau, and the Connecticut Department of Transportation.


Subject(s)
Algorithms , Case-Control Studies , Data Interpretation, Statistical , Endometrial Neoplasms/epidemiology , Models, Statistical , Risk Assessment/methods , Computer Simulation , Epidemiologic Methods , Female , Humans , Information Storage and Retrieval/methods , Prevalence
14.
Res Sq ; 2023 May 15.
Article in English | MEDLINE | ID: mdl-37333260

ABSTRACT

Genome-wide DNA methylation studies have typically focused on quantitative assessments of CpG methylation at individual loci. Although methylation states at nearby CpG sites are known to be highly correlated, suggestive of an underlying coordinated regulatory network, the extent and consistency of inter-CpG methylation correlation across the genome, including variation between individuals, disease states, and tissues, remains unknown. Here, we leverage image conversion of correlation matrices to identify correlated methylation units (CMUs) across the genome, describe their variation across tissues, and annotate their regulatory potential using 35 public Illumina BeadChip datasets spanning more than 12,000 individuals and 26 different tissues. We identified a median of 18,125 CMUs genome-wide, occurring on all chromosomes and spanning a median of ~1 kb. Notably, 50% of CMUs had evidence of long-range correlation with other proximal CMUs. Although the size and number of CMUs varied across datasets, we observed strong intra-tissue consistency among CMUs, with those in testis encompassing those seen in most other tissues. Approximately 20% of CMUs were highly conserved across normal tissues (i.e. tissue independent), with 73 loci demonstrating strong correlation with non-adjacent CMUs on the same chromosome. These loci were enriched for CTCF and transcription factor binding sites, always found within putative TADs, and associated with the B compartment of chromosome folding. Finally, we observed significantly different, but highly consistent, patterns of CMU correlation between diseased and non-diseased states. Our first-generation, genome-wide, DNA methylation map suggests a highly coordinated CMU regulatory network that is sensitive to disruptions in its architecture.

15.
Stat Methods Med Res ; 31(2): 315-333, 2022 02.
Article in English | MEDLINE | ID: mdl-34931910

ABSTRACT

Cocaine addiction is an important public health problem worldwide. Cognitive-behavioral therapy is a counseling intervention for supporting cocaine-dependent individuals through recovery and relapse prevention. It may reduce patients' cocaine use by improving their motivation and enabling them to recognize risky situations. To study the effect of cognitive-behavioral therapy on cocaine dependence, the self-reported cocaine use with urine test data were collected at the Primary Care Center of Yale-New Haven Hospital. Its outcomes are binary, including both the daily self-reported drug use and weekly urine test results. To date, generalized estimating equations have been widely used to analyze binary data with repeated measures. However, due to the existence of significant self-report bias in the self-reported cocaine use with urine test data, a direct application of the generalized estimating equations approach may not be valid. In this paper, we propose a novel mean corrected generalized estimating equations approach for analyzing longitudinal binary outcomes subject to reporting bias. The mean corrected generalized estimating equations can provide consistent and asymptotically normally distributed estimators under true contamination probabilities. In the self-reported cocaine use with urine test study, accurate weekly urine test results are used to detect contamination. The superior performance of the proposed method is illustrated by both simulation studies and real data analysis.


Subject(s)
Cocaine , Research Design , Bias , Computer Simulation , Humans , Self Report
16.
Am J Hum Genet ; 82(5): 1193-201, 2008 May.
Article in English | MEDLINE | ID: mdl-18439552

ABSTRACT

Data from the Pharmacogenomics and Risk of Cardiovascular Disease (PARC) study and the Cardiovascular Health Study (CHS) provide independent and confirmatory evidence for association between common polymorphisms of the HNF1A gene encoding hepatocyte nuclear factor-1 alpha and plasma C-reactive protein (CRP) concentration. Analyses with the use of imputation-based methods to combine genotype data from both studies and to test untyped SNPs from the HapMap database identified several SNPs within a 5 kb region of HNF1A intron 1 with the strongest evidence of association with CRP phenotype.


Subject(s)
C-Reactive Protein/genetics , Hepatocyte Nuclear Factor 1-alpha/genetics , Aged , Bayes Theorem , Female , Humans , Hydroxymethylglutaryl-CoA Reductase Inhibitors/therapeutic use , Male , Middle Aged , Polymorphism, Single Nucleotide , Pravastatin/therapeutic use , Simvastatin/therapeutic use
17.
Biometrics ; 67(3): 926-36, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21133879

ABSTRACT

We introduce novel regression extrapolation based methods to correct the often large bias in subsampling variance estimation as well as hypothesis testing for spatial point and marked point processes. For variance estimation, our proposed estimators are linear combinations of the usual subsampling variance estimator based on subblock sizes in a continuous interval. We show that they can achieve better rates in mean squared error than the usual subsampling variance estimator. In particular, for n×n observation windows, the optimal rate of n^{-2} can be achieved if the data have a finite dependence range. For hypothesis testing, we apply the proposed regression extrapolation directly to the test statistics based on different subblock sizes, and therefore avoid the need to conduct bias correction for each element in the covariance matrix used to set up the test statistics. We assess the numerical performance of the proposed methods through simulation, and apply them to analyze a tropical forest data set.


Subject(s)
Bias , Regression Analysis , Computer Simulation , Humans , Methods
18.
Biometrics ; 67(3): 730-9, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21361885

ABSTRACT

A typical recurrent event dataset consists of an often large number of recurrent event processes, each of which contains multiple event times observed from an individual during a follow-up period. Such data have become increasingly available in medical and epidemiological studies. In this article, we introduce novel procedures to conduct second-order analysis for a flexible class of semiparametric recurrent event processes. Such an analysis can provide useful information regarding the dependence structure within each recurrent event process. Specifically, we will use the proposed procedures to test whether the individual recurrent event processes are all Poisson processes and to suggest sensible alternative models for them if they are not. We apply these procedures to a well-known recurrent event dataset on chronic granulomatous disease and an epidemiological dataset on meningococcal disease cases in Merseyside, United Kingdom to illustrate their practical value.


Subject(s)
Data Interpretation, Statistical , Models, Statistical , Recurrence , Biometry/methods , Epidemiologic Studies , Granulomatous Disease, Chronic/pathology , Humans , Meningococcal Infections/epidemiology , United Kingdom
19.
Biometrics ; 67(3): 711-8, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21361887

ABSTRACT

This article is concerned with variance estimation for statistics that are computed from single recurrent event processes. Such statistics are important in diagnosis for each individual recurrent event process. The proposed method only assumes a semiparametric form for the first-order structure of the processes but not for the second-order (i.e., dependence) structure. The new variance estimator is shown to be consistent for the target parameter under very mild conditions. The estimator can be used in many applications in semiparametric rate regression analysis of recurrent event data such as outlier detection, residual diagnosis, as well as robust regression. A simulation study and application to two real data examples are used to demonstrate the use of the proposed method.


Subject(s)
Biometry/methods , Models, Statistical , Recurrence , Analysis of Variance , Regression Analysis
20.
PLoS Genet ; 4(12): e1000279, 2008 Dec.
Article in English | MEDLINE | ID: mdl-19057666

ABSTRACT

Imputation-based association methods provide a powerful framework for testing untyped variants for association with phenotypes and for combining results from multiple studies that use different genotyping platforms. Here, we consider several issues that arise when applying these methods in practice, including: (i) factors affecting imputation accuracy, including choice of reference panel; (ii) the effects of imputation accuracy on power to detect associations; (iii) the relative merits of Bayesian and frequentist approaches to testing imputed genotypes for association with phenotype; and (iv) how to quickly and accurately compute Bayes factors for testing imputed SNPs. We find that imputation-based methods can be robust to imputation accuracy and can improve power to detect associations, even when average imputation accuracy is poor. We explain how ranking SNPs for association by a standard likelihood ratio test gives the same results as a Bayesian procedure that uses an unnatural prior assumption--specifically, that difficult-to-impute SNPs tend to have larger effects--and assess the power gained from using a Bayesian approach that does not make this assumption. Within the Bayesian framework, we find that good approximations to a full analysis can be achieved by simply replacing unknown genotypes with a point estimate--their posterior mean. This approximation considerably reduces computational expense compared with published sampling-based approaches, and the methods we present are practical on a genome-wide scale with very modest computational resources (e.g., a single desktop computer). The approximation also facilitates combining information across studies, using only summary data for each SNP. Methods discussed here are implemented in the software package BIMBAM, which is available from http://stephenslab.uchicago.edu/software.html.


Subject(s)
Genetic Techniques/standards , Genome-Wide Association Study/standards , Mathematical Computing , Databases, Genetic , Genotype , Humans , Models, Genetic , Models, Statistical , Phenotype , Polymorphism, Single Nucleotide , Software
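
The point-estimate approximation described in the abstract, replacing an unknown genotype with its posterior mean, can be illustrated with a small calculation. This is a sketch only, not BIMBAM's interface; the function name and the probabilities are hypothetical, and genotypes are assumed to be coded as 0, 1, or 2 copies of an allele.

```python
def posterior_mean_genotype(p0, p1, p2):
    """Expected allele count given posterior probabilities of genotypes 0, 1, and 2."""
    total = p0 + p1 + p2
    if abs(total - 1.0) > 1e-6:
        raise ValueError("genotype probabilities must sum to 1")
    return 0.0 * p0 + 1.0 * p1 + 2.0 * p2


# Hypothetical imputation output for one individual at one untyped SNP.
dosage = posterior_mean_genotype(0.1, 0.6, 0.3)
```

The resulting value can be used in place of the unobserved genotype in an association test, which avoids sampling over the possible genotypes.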