Results 1 - 20 of 46
1.
Bioinformatics ; 40(2), 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38364309

ABSTRACT

MOTIVATION: Estimating the individual inbreeding coefficient and pairwise kinship is an important problem in human genetics (e.g. in disease mapping) and in animal and plant genetics (e.g. inbreeding design). Existing methods, such as the sample correlation-based genetic relationship matrix, KING, and UKin, are either biased, unable to estimate inbreeding coefficients, or produce a large proportion of negative estimates that are difficult to interpret. This limitation of existing methods is partly due to their failure to model inbreeding explicitly. Since all humans are inbred to various degrees by virtue of shared ancestries, it is prudent to account for inbreeding when inferring kinship between individuals. RESULTS: We present "Kindred," an approach that estimates inbreeding and kinship by modeling latent identity-by-descent states that account for all possible allele sharing, including inbreeding, between two individuals. Kindred uses a non-negative least squares method to fit the model, which not only increases computational efficiency compared to the maximum likelihood method, but also guarantees non-negativity of the kinship estimates. Through simulation, we demonstrate the high accuracy and non-negativity of kinship estimates by Kindred. By selecting a subset of SNPs that are similar in allele frequencies across different continental populations, Kindred can accurately estimate kinship between admixed samples. In addition, we demonstrate that the realized kinship matrix estimated by Kindred is effective in reducing genomic control values via a linear mixed model in genome-wide association studies. Finally, we demonstrate that Kindred produces sensible heritability estimates on an Australian height dataset. AVAILABILITY AND IMPLEMENTATION: Kindred is implemented in C with multi-threading. It takes a VCF file or stream as input and works seamlessly with bcftools. Kindred is freely available at https://github.com/haplotype/kindred.
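
The non-negative least squares step mentioned in the abstract can be illustrated with a small sketch. This is not Kindred's C implementation and does not reproduce its identity-by-descent model; it is a generic cyclic coordinate-descent solver for min ||Ax - b||^2 subject to x >= 0. The function name `nnls_coordinate_descent` and the toy inputs are assumptions made purely for illustration.

```python
def nnls_coordinate_descent(A, b, n_iter=200):
    """Minimise ||A x - b||^2 subject to x >= 0 by cyclic coordinate descent.

    A is a list of rows, b a list of floats. Illustrative sketch only;
    this is NOT the solver used by Kindred.
    """
    n_rows, n_cols = len(A), len(A[0])
    # Precompute the normal-equation pieces H = A^T A and Atb = A^T b.
    H = [[sum(A[r][i] * A[r][j] for r in range(n_rows)) for j in range(n_cols)]
         for i in range(n_cols)]
    Atb = [sum(A[r][i] * b[r] for r in range(n_rows)) for i in range(n_cols)]
    x = [0.0] * n_cols
    for _ in range(n_iter):
        for j in range(n_cols):
            if H[j][j] == 0.0:
                continue  # all-zero column: its coefficient stays at 0
            # Half of the partial derivative of the objective w.r.t. x[j].
            grad_j = sum(H[j][k] * x[k] for k in range(n_cols)) - Atb[j]
            # Exact minimiser along coordinate j, clipped at zero.
            x[j] = max(0.0, x[j] - grad_j / H[j][j])
    return x


# Toy example: an ordinary least squares fit would give a negative slope here;
# the constrained fit pins that coefficient at zero instead.
design = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
response = [3.0, 2.0, 1.0]
estimate = nnls_coordinate_descent(design, response)
```

The clipping step is what guarantees that every coefficient stays non-negative, which is the property the abstract highlights for the kinship estimates.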


Subject(s)
Genome-Wide Association Study , Inbreeding , Animals , Humans , Australia , Genome , Gene Frequency , Pedigree
2.
Res Sq ; 2023 May 15.
Article in English | MEDLINE | ID: mdl-37333260

ABSTRACT

Genome-wide DNA methylation studies have typically focused on quantitative assessments of CpG methylation at individual loci. Although methylation states at nearby CpG sites are known to be highly correlated, suggestive of an underlying coordinated regulatory network, the extent and consistency of inter-CpG methylation correlation across the genome, including variation between individuals, disease states, and tissues, remains unknown. Here, we leverage image conversion of correlation matrices to identify correlated methylation units (CMUs) across the genome, describe their variation across tissues, and annotate their regulatory potential using 35 public Illumina BeadChip datasets spanning more than 12,000 individuals and 26 different tissues. We identified a median of 18,125 CMUs genome-wide, occurring on all chromosomes and spanning a median of ~1 kb. Notably, 50% of CMUs had evidence of long-range correlation with other proximal CMUs. Although the size and number of CMUs varied across datasets, we observed strong intra-tissue consistency among CMUs, with those in testis encompassing those seen in most other tissues. Approximately 20% of CMUs were highly conserved across normal tissues (i.e. tissue independent), with 73 loci demonstrating strong correlation with non-adjacent CMUs on the same chromosome. These loci were enriched for CTCF and transcription factor binding sites, always found within putative TADs, and associated with the B compartment of chromosome folding. Finally, we observed significantly different, but highly consistent, patterns of CMU correlation between diseased and non-diseased states. Our first-generation, genome-wide, DNA methylation map suggests a highly coordinated CMU regulatory network that is sensitive to disruptions in its architecture.

3.
Stat Methods Med Res ; 31(2): 315-333, 2022 02.
Article in English | MEDLINE | ID: mdl-34931910

ABSTRACT

Cocaine addiction is an important public health problem worldwide. Cognitive-behavioral therapy is a counseling intervention for supporting cocaine-dependent individuals through recovery and relapse prevention. It may reduce patients' cocaine use by improving their motivation and enabling them to recognize risky situations. To study the effect of cognitive-behavioral therapy on cocaine dependence, the self-reported cocaine use with urine test data were collected at the Primary Care Center of Yale-New Haven Hospital. The outcomes are binary, including both the daily self-reported drug use and weekly urine test results. To date, generalized estimating equations have been widely used to analyze binary data with repeated measures. However, due to significant self-report bias in the self-reported cocaine use with urine test data, a direct application of the generalized estimating equations approach may not be valid. In this paper, we propose a novel mean corrected generalized estimating equations approach for analyzing longitudinal binary outcomes subject to reporting bias. The mean corrected generalized estimating equations provide consistent and asymptotically normally distributed estimators under true contamination probabilities. In the self-reported cocaine use with urine test study, accurate weekly urine test results are used to detect contamination. The superior performance of the proposed method is illustrated by both simulation studies and real data analysis.


Subject(s)
Cocaine , Research Design , Bias , Computer Simulation , Humans , Self Report
4.
Nat Commun ; 12(1): 510, 2021 01 21.
Article in English | MEDLINE | ID: mdl-33479230

ABSTRACT

Accurate pathogenicity prediction of missense variants is critically important in genetic studies and clinical diagnosis. Previously published prediction methods have facilitated the interpretation of missense variants but have limited performance. Here, we describe MVP (Missense Variant Pathogenicity prediction), a new prediction method that uses a deep residual network to leverage large training data sets and many correlated predictors. We train the model separately in genes that are intolerant of loss of function variants and in those that are tolerant, in order to take account of potentially different genetic effect sizes and modes of action. We compile cancer mutation hotspots and de novo variants from developmental disorders for benchmarking. Overall, MVP achieves better performance in prioritizing pathogenic missense variants than previous methods, especially in genes tolerant of loss of function variants. Finally, using MVP, we estimate that de novo coding variants contribute to 7.8% of isolated congenital heart disease, nearly doubling previous estimates.


Subject(s)
Computational Biology/methods , Deep Learning , Genetic Predisposition to Disease/genetics , Mutation, Missense , Neoplasms/genetics , Algorithms , Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/genetics , Heart Defects, Congenital/diagnosis , Heart Defects, Congenital/genetics , Humans , Neoplasms/diagnosis , Reproducibility of Results , Sensitivity and Specificity
5.
Genome Res ; 30(9): 1364-1375, 2020 09.
Article in English | MEDLINE | ID: mdl-32883749

ABSTRACT

We present Nubeam (nucleotide be a matrix) as a novel reference-free approach to analyze short sequencing reads. Nubeam represents nucleotides by matrices, transforms a read into a product of matrices, and assigns numbers to reads based on the product matrix. Nubeam capitalizes on the noncommutative property of matrix multiplication, such that different reads are assigned different numbers and similar reads similar numbers. A sample, which is a collection of reads, becomes a collection of numbers that form an empirical distribution. We demonstrate that the genetic difference between samples can be quantified by the distance between empirical distributions. Nubeam includes the k-mer method as a special case, but unlike the k-mer method, it is convenient for Nubeam to account for GC bias and nucleotide quality. As a reference-free approach, Nubeam avoids reference bias and mapping bias, and can work with organisms without reference genomes. Thus, Nubeam is ideal to analyze data sets from metagenomics whole genome shotgun (WGS) sequencing, where the amount of unmapped reads is substantial. When applied to a WGS sequencing data set to quantify distances between metagenomics samples from various human body habitats, Nubeam recapitulates findings made by mapping-based methods and sheds light on contributions of unmapped reads. Nubeam is also useful in analyzing 16S rRNA sequencing data, which is a more prevalent type of data set in metagenomics studies. In our analysis, Nubeam recapitulated the findings that natural microbiota in mouse gut are resilient under challenges, and Nubeam detected differences in vaginal microbiota between cases of polycystic ovary syndrome and healthy controls.
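
The idea of turning a read into a product of matrices, and relying on non-commutativity so that read order matters, can be sketched as follows. The abstract does not give the matrices Nubeam uses or how it maps the product to a number, so the 2x2 integer matrices below are hypothetical stand-ins: each nucleotide is encoded as a two-letter word over the matrices `L` and `R`, which are known to generate a free monoid, so distinct reads give distinct products in this toy encoding.

```python
# Hypothetical building blocks; the matrices used by Nubeam itself are not
# stated in the abstract.
L = ((1, 1), (0, 1))
R = ((1, 0), (1, 1))


def matmul(X, Y):
    """Multiply two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


# Assumed encoding: A = L*L, C = L*R, G = R*L, T = R*R.
NUCLEOTIDE = {
    "A": matmul(L, L),
    "C": matmul(L, R),
    "G": matmul(R, L),
    "T": matmul(R, R),
}


def read_to_matrix(read):
    """Return the ordered product of the matrices of the bases in `read`."""
    product = ((1, 0), (0, 1))  # identity matrix
    for base in read:
        product = matmul(product, NUCLEOTIDE[base])
    return product


# Matrix multiplication is not commutative, so reordering bases changes the result.
forward = read_to_matrix("AC")
swapped = read_to_matrix("CA")
```

A scalar summary such as the trace would not be enough on its own in this toy setup (the two products above have the same trace), which is why the sketch keeps the whole product matrix.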


Subject(s)
Metagenomics/methods , Whole Genome Sequencing/methods , Animals , Female , Gastrointestinal Microbiome , Humans , Mice , RNA, Ribosomal, 16S , Sequence Analysis, RNA/methods , Vagina/microbiology
6.
Bioinformatics ; 36(10): 3254-3256, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32091581

ABSTRACT

SUMMARY: We present Nubeam-dedup, a fast and RAM-efficient tool to de-duplicate sequencing reads without a reference genome. Nubeam-dedup represents nucleotides by matrices, transforms reads into products of matrices, and assigns a unique number to each read based on its product matrix. Thus, duplicate reads can be efficiently removed by using a collisionless hash function. Compared with other state-of-the-art reference-free tools, Nubeam-dedup uses 50-70% of CPU time and 10-15% of RAM. AVAILABILITY AND IMPLEMENTATION: Source code in C++ and manual are available at https://github.com/daihang16/nubeamdedup and https://haplotype.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
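
A minimal sketch of reference-free de-duplication keyed on a per-read matrix product is given below. It is not the Nubeam-dedup C++ code: the base matrices in `BASE`, the helper names `read_key` and `deduplicate`, and the example reads are all assumptions, chosen so that distinct reads cannot share a key in this toy encoding.

```python
# Hypothetical 2x2 non-negative integer matrices, flattened as (a, b, c, d)
# for the matrix [[a, b], [c, d]]. Nubeam-dedup's real matrices are not
# given in the abstract.
BASE = {
    "A": (1, 2, 0, 1),
    "C": (2, 1, 1, 1),
    "G": (1, 1, 1, 2),
    "T": (1, 0, 2, 1),
}


def read_key(read):
    """Multiply the base matrices in read order and return the product."""
    a, b, c, d = 1, 0, 0, 1  # start from the identity matrix
    for base in read:
        e, f, g, h = BASE[base]
        a, b, c, d = a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h
    return (a, b, c, d)


def deduplicate(reads):
    """Keep the first occurrence of each read, preserving input order."""
    seen = set()
    unique = []
    for read in reads:
        key = read_key(read)
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique


unique_reads = deduplicate(["ACGT", "ACGT", "TGCA", "ACGT", "TGCA", "AC"])
```

In Python one could hash the read strings directly; the sketch only illustrates how a numeric key derived from a matrix product can stand in for the read when testing for duplicates.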


Subject(s)
High-Throughput Nucleotide Sequencing , Software , Algorithms , Genome , Sequence Analysis, DNA
7.
Genet Med ; 22(2): 301-308, 2020 02.
Article in English | MEDLINE | ID: mdl-31467446

ABSTRACT

PURPOSE: Fetal fraction (FF) is the percent of cell-free DNA (cfDNA) in the mother's peripheral blood that is of fetal origin, which plays a pivotal role in noninvasive prenatal screening (NIPS). We present a method that can reliably estimate FFs by examining autosome single-nucleotide polymorphisms (SNPs). METHODS: Even at a very low sequencing depth, there are plenty of SNPs covered by more than one read. At those SNPs, we define read heterozygosity and demonstrate that the percent of read heterozygosity is a function of FF, which allows FF to be inferred. RESULTS: We first demonstrated the effectiveness of our method in inferring FF. Then we used the inferred FF as an informative alternative prior to computing Bayes factors to test for aneuploidy, and observed better power than the Z-test. In analysis of clinical samples, we were able to identify female-male twins thanks to the accurate FF inference. CONCLUSION: Knowing FF improves efficacy of NIPS. It brings a powerful Bayesian method, allows "no call" for samples with small FFs, renders screening for XXY syndrome simpler, and permits an adaptive design to sequence at a higher depth for samples with small FFs.


Subject(s)
Cell-Free Nucleic Acids/analysis , Fetal Development/genetics , Noninvasive Prenatal Testing/methods , Chromosome Aberrations , Female , Fetus , High-Throughput Nucleotide Sequencing/methods , Humans , Polymorphism, Single Nucleotide/genetics , Pregnancy , Prenatal Care , Prenatal Diagnosis/methods , Sequence Analysis, DNA/methods
8.
Genet Med ; 22(2): 450, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31822850

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

9.
Nat Commun ; 10(1): 5791, 2019 12 19.
Article in English | MEDLINE | ID: mdl-31857576

ABSTRACT

Edematous severe acute childhood malnutrition (edematous SAM or ESAM), which includes kwashiorkor, presents with more overt multi-organ dysfunction than non-edematous SAM (NESAM). Reduced concentrations and methyl-flux of methionine in 1-carbon metabolism have been reported in acute, but not recovered, ESAM, suggesting downstream DNA methylation changes could be relevant to differences in SAM pathogenesis. Here, we assess genome-wide DNA methylation in buccal cells of 309 SAM children using the 450 K microarray. Relative to NESAM, ESAM is characterized by multiple significantly hypomethylated loci, which is not observed among SAM-recovered adults. Gene expression and methylation show both positive and negative correlation, suggesting a complex transcriptional response to SAM. Hypomethylated loci link to disorders of nutrition and metabolism, including fatty liver and diabetes, and appear to be influenced by genetic variation. Our epigenetic findings provide a potential molecular link to reported aberrant 1-carbon metabolism in ESAM and support consideration of methyl-group supplementation in ESAM.


Subject(s)
DNA Methylation , Epigenome/genetics , Severe Acute Malnutrition/genetics , Adolescent , Adult , Case-Control Studies , Child, Preschool , CpG Islands/genetics , Epigenomics/methods , Female , Gene Expression Profiling , Humans , Infant , Jamaica/epidemiology , Malawi/epidemiology , Male , Mouth Mucosa , Prospective Studies , Retrospective Studies , Severe Acute Malnutrition/mortality , Survivors , Young Adult
10.
Bayesian Anal ; 14(2): 573-594, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31608133

ABSTRACT

Bayesian variable selection regression (BVSR) is able to jointly analyze genome-wide genetic datasets, but slow computation via Markov chain Monte Carlo (MCMC) has hampered its widespread use. Here we present a novel iterative method to solve a special class of linear systems, which can increase the speed of BVSR model-fitting tenfold. The iterative method hinges on the complex factorization of the sum of two matrices, and the solution path resides in the complex domain (instead of the real domain). Compared to the Gauss-Seidel method, the complex factorization converges almost instantaneously and its error is several orders of magnitude smaller than that of the Gauss-Seidel method. More importantly, its error is always within the pre-specified precision, while that of the Gauss-Seidel method is not. For large problems with thousands of covariates, the complex factorization is 10-100 times faster than either the Gauss-Seidel method or the direct method via the Cholesky decomposition. In BVSR, one needs to repeatedly solve large penalized regression systems whose design matrices change only slightly between adjacent MCMC steps. This slight change in the design matrix enables the adaptation of the iterative complex factorization method. The computational innovation will facilitate the widespread use of BVSR in reanalyzing genome-wide association datasets.
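
For context, the Gauss-Seidel method that the abstract uses as a baseline can be sketched in a few lines. This is the textbook iteration, not the authors' complex factorization method; the function name, tolerance, and the small symmetric positive definite test system are assumptions for illustration.

```python
def gauss_seidel(A, b, tol=1e-10, max_iter=10_000):
    """Solve A x = b by Gauss-Seidel sweeps.

    Converges for symmetric positive definite or strictly diagonally
    dominant A. Returns the solution and the number of sweeps performed.
    """
    n = len(b)
    x = [0.0] * n
    for sweep in range(1, max_iter + 1):
        max_change = 0.0
        for i in range(n):
            # Use the newest available values of the other coordinates.
            off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new_xi = (b[i] - off_diag) / A[i][i]
            max_change = max(max_change, abs(new_xi - x[i]))
            x[i] = new_xi
        if max_change < tol:
            return x, sweep
    return x, max_iter


# A small ridge-like system: symmetric, positive definite.
solution, sweeps = gauss_seidel([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

The number of sweeps needed grows as the system becomes less diagonally dominant, which is the kind of slow convergence the abstract contrasts with its own approach.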

11.
Blood Adv ; 2(24): 3637-3647, 2018 12 26.
Article in English | MEDLINE | ID: mdl-30578281

ABSTRACT

Red blood cell (RBC) transfusion remains a critical therapeutic intervention in sickle cell disease (SCD); however, the apparent propensity of some patients to regularly develop RBC alloantibodies after transfusion presents a significant challenge to finding compatible blood for so-called alloimmunization responders. Predisposing genetic loci have long been thought to contribute to the responder phenomenon, but to date, no definitive loci have been identified. We undertook a genome-wide association study of alloimmunization responder status in 267 SCD multiple transfusion recipients, using genetic estimates of ancestral admixture to bolster our findings. Analyses revealed single nucleotide polymorphisms (SNPs) on chromosomes 2 and 5 approaching genome-wide significance (minimum P = 2.0 × 10⁻⁸ and 8.4 × 10⁻⁸, respectively), with local ancestry analysis demonstrating similar levels of admixture in responders and nonresponders at implicated loci. Association at chromosome 5 was nominally replicated in an independent cohort of 130 SCD transfusion recipients, with meta-analysis surpassing genome-wide significance (rs75853687, P_meta = 6.6 × 10⁻⁹), and this extended to individuals forming multiple (>3) alloantibodies (P_meta = 9.4 × 10⁻⁵). The associated variant is rare outside of African populations, and orthogonal genome-wide haplotype analyses, contingent on local ancestry, revealed genome-wide significant sharing of a ∼60-kb haplotype of African ancestry at the chromosome 5 locus (Bayes Factor = 4.95). This locus overlaps a putative cis-acting enhancer predicted to regulate transcription of ADRA1B and the lncRNA LINC01847, both members of larger ontologies associated with immune regulation. Our findings provide potential insights into the pathophysiology underlying the development of alloantibodies and implicate non-RBC ancestry-limited loci in the susceptibility to alloimmunization.


Subject(s)
Anemia, Sickle Cell/pathology , Black or African American/genetics , Chromosomes, Human, Pair 5/genetics , Isoantibodies/blood , Alleles , Anemia, Sickle Cell/genetics , Anemia, Sickle Cell/immunology , Chromosomes, Human, Pair 2/genetics , Genetic Loci , Genome-Wide Association Study , Genotype , Haplotypes , Humans , Polymorphism, Single Nucleotide , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Receptors, Adrenergic, alpha-1/genetics , Receptors, Adrenergic, alpha-1/metabolism
12.
J Am Stat Assoc ; 113(523): 1362-1371, 2018.
Article in English | MEDLINE | ID: mdl-30386004

ABSTRACT

We show that under the null, 2 log(Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and the normal prior. Our results have three immediate impacts. First, we can compute analytically a p-value associated with a Bayes factor without the need for permutation. We provide a software package that can evaluate the p-value associated with a Bayes factor efficiently and accurately. Second, the null distribution illuminates some intrinsic properties of the Bayes factor, namely, how the Bayes factor quantitatively depends on the prior and the genesis of Bartlett's paradox. Third, enlightened by the null distribution of the Bayes factor, we formulate a novel scaled Bayes factor that depends less on the prior and is immune to Bartlett's paradox. When two tests have an identical p-value, the test with the larger power tends to have a larger scaled Bayes factor, a desirable property that is missing for the (unscaled) Bayes factor.
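
To make the phrase "weighted sum of chi-squared random variables with a shifted mean" concrete, the sketch below approximates the tail probability of such a distribution by Monte Carlo simulation. The paper computes the p-value analytically; simulation is swapped in here only because it is short and transparent. The weights, the shift, and the function name are hypothetical inputs, not values taken from the paper.

```python
import random


def weighted_chisq_pvalue(observed, weights, shift=0.0, n_draws=20_000, seed=0):
    """Estimate P(shift + sum_i w_i * Z_i^2 >= observed), with Z_i ~ N(0, 1).

    Monte Carlo approximation for illustration; `weights` and `shift` are
    assumed to be supplied by the caller.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_draws):
        # Each squared standard normal is a chi-squared draw with 1 d.o.f.
        draw = shift + sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        if draw >= observed:
            exceed += 1
    return exceed / n_draws


# With a single unit weight and no shift this is an ordinary chi-squared(1) tail.
p_single = weighted_chisq_pvalue(3.841, weights=[1.0])
# Hypothetical weights and a negative shift, as an example of the general case.
p_mixed = weighted_chisq_pvalue(2.0, weights=[0.9, 0.5, 0.1], shift=-1.0)
```

The estimate is only as precise as the number of draws allows, so very small p-values would need an analytic method such as the one the abstract describes.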

13.
J Theor Biol ; 455: 342-356, 2018 10 14.
Article in English | MEDLINE | ID: mdl-30053386

ABSTRACT

Chikungunya, dengue, and Zika viruses, all of which are transmitted by the Aedes aegypti and Aedes albopictus mosquito species, have been imported to Florida and have caused local outbreaks. We propose a deterministic model to study the importation and local transmission of these mosquito-borne diseases. The purpose is to model and mimic the importation of these viruses to Florida via travelers, local infections in domestic mosquitoes by imported travelers, and finally non-travel related transmissions to local humans by infected local mosquitoes. As a case study, the model is used to simulate the cumulative Zika cases in Florida. Since the disease system is driven by a continuing input of infections from outside sources, orthodox analytic methods based on the calculation of the basic reproduction number are inadequate to describe and predict its behavior. Via steady-state analysis and sensitivity analysis, effective control and prevention measures for these mosquito-borne diseases are tested.


Subject(s)
Aedes/virology , Disease Outbreaks , Models, Biological , Mosquito Vectors/virology , Zika Virus Infection , Zika Virus , Animals , Chikungunya Fever/epidemiology , Chikungunya Fever/transmission , Chikungunya virus , Dengue/epidemiology , Dengue/transmission , Dengue Virus , Florida/epidemiology , Humans , Zika Virus Infection/epidemiology , Zika Virus Infection/transmission
14.
Genet Med ; 20(8): 817-824, 2018 08.
Article in English | MEDLINE | ID: mdl-29120459

ABSTRACT

PURPOSE: Noninvasive prenatal screening (NIPS) sequences a mixture of the maternal and fetal cell-free DNA. Fetal trisomy can be detected by examining chromosomal dosages estimated from sequencing reads. The traditional method uses the Z-test, which compares a subject against a set of euploid controls and does not fully utilize the information in the fetal fraction. Here we present a Bayesian method that leverages informative priors on the fetal fraction. METHOD: Our Bayesian method combines the Z-test likelihood and informative priors of the fetal fraction, which are learned from the sex chromosomes, to compute Bayes factors. The Bayesian framework can account for nongenetic risk factors through the prior odds, and our method can report individual positive/negative predictive values. RESULTS: Our Bayesian method has more power than the Z-test method. We analyzed 3,405 NIPS samples and spotted at least 9 (of 51) possible Z-test false positives. CONCLUSION: Bayesian NIPS is more powerful than the Z-test method, is able to account for nongenetic risk factors through prior odds, and can report individual positive/negative predictive values.
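
The way a Bayes factor and prior odds combine into an individual predictive value follows directly from Bayes' theorem: posterior odds equal the Bayes factor times the prior odds. The sketch below shows only that arithmetic; the function name and the example numbers are hypothetical, and it does not reproduce how the paper computes its Bayes factors.

```python
def posterior_probability(bayes_factor, prior_probability):
    """Posterior probability of the alternative (e.g. trisomy) given the data.

    posterior odds = Bayes factor * prior odds, then odds -> probability.
    """
    if not 0.0 < prior_probability < 1.0:
        raise ValueError("prior_probability must lie strictly between 0 and 1")
    if bayes_factor < 0.0:
        raise ValueError("bayes_factor must be non-negative")
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical inputs: the same Bayes factor under two different prior risks,
# for instance a lower and a higher age-related risk.
low_risk = posterior_probability(bayes_factor=99.0, prior_probability=0.01)
high_risk = posterior_probability(bayes_factor=99.0, prior_probability=0.05)
```

Because the prior enters only through the prior odds, nongenetic risk factors can change the reported predictive value without changing the Bayes factor itself.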


Subject(s)
Bayes Theorem , Prenatal Diagnosis/methods , Sequence Analysis, DNA/methods , Adult , China , Female , Fetus , High-Throughput Nucleotide Sequencing/methods , Humans , Markov Chains , Pregnancy , Prenatal Care
15.
Clin Rheumatol ; 36(8): 1819-1826, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28432524

ABSTRACT

This study aimed to assess a set of biomarkers for their correlations with disease activity/severity in patients with ankylosing spondylitis (AS). A total of 24 AS patients were treated with etanercept and prospectively followed for 12 weeks. Serum levels of TNF-α, IFN-γ, TGF-β, IL6, IL15, IL17, MMP3, and MICA were measured at baseline and after treatment. The changes in these biomarkers were analyzed for correlations with MRI indices for joint inflammation, Bath Ankylosing Spondylitis Disease Activity Index, Bath Ankylosing Spondylitis Functional Index, AS Disease Activity Score, serum CRP, and ESR. The Wilcoxon rank sum test was used to compare the biomarker levels between pre- and post-treatment and between pre-treatment and controls. Both step-wise procedures based on the Akaike information criterion (AIC) and the least absolute shrinkage and selection operator with fivefold cross-validation were used to select the best model for pairwise correlations between the above clinical measures and the serum biomarkers. Serum levels of both MMP3 and IL6 were significantly higher in AS patients at baseline. After treatment, the levels of MMP3 decreased, but TGF-β and TNF-α increased significantly. The changes in serum MMP3 and MICA were significantly associated with MRI sacroiliac joint (SIJ) scores. CRP was positively correlated with serum MMP3 and IL6. The pattern of combined changes in serum MICA, MMP3, TGF-β, IL17, TNF-α, and IFN-γ predicted the MRI score of SIJ by logistic regression analysis. Specific serum biomarkers were significantly associated with clinical measures of AS. Most prominently, serum MMP3 level was found to have a positive correlation with the MRI score of SIJ and CRP. Serum MICA level negatively correlated with disease remission.


Subject(s)
Antirheumatic Agents/therapeutic use , Etanercept/therapeutic use , Matrix Metalloproteinase 3/blood , Spondylitis, Ankylosing/blood , Spondylitis, Ankylosing/drug therapy , Adult , Biomarkers/blood , C-Reactive Protein/analysis , Cytokines/blood , Female , Humans , Magnetic Resonance Imaging , Male , Middle Aged , Pilot Projects , Prospective Studies , Sacroiliac Joint/diagnostic imaging , Severity of Illness Index , Spondylitis, Ankylosing/diagnostic imaging , Treatment Outcome , Young Adult
16.
Biometrics ; 73(4): 1311-1320, 2017 12.
Article in English | MEDLINE | ID: mdl-28369699

ABSTRACT

Applications of spatial point processes for large and complex data sets with inhomogeneities as encountered, for example, in tropical rain forest ecology call for estimation methods that are both statistically and computationally efficient. We propose a novel second-order quasi-likelihood procedure to estimate the parameters for a second-order intensity reweighted stationary spatial point process. Our approach is to derive first- and second-order estimating functions and then combine them linearly using appropriate weight functions. In the stationary case, we argue that the asymptotically optimal weight functions are respectively a constant and a function of lags between distinct locations in the observation window. This leads to a considerable gain in computational efficiency. We further exploit this simplification in the nonstationary case. Simulations show that, when compared with several existing approaches, our method can achieve significant gains in statistical efficiency. An application to a tropical rain forest data set further illustrates the advantages of our procedure.


Subject(s)
Biometry , Ecology , Models, Statistical , Algorithms , Computer Simulation , Rainforest
17.
Nat Commun ; 7: 12065, 2016 06 30.
Article in English | MEDLINE | ID: mdl-27356984

ABSTRACT

Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arrays and generate a de novo assembly of 2.93 Gb (contig N50: 8.3 Mb, scaffold N50: 22.0 Mb, including 39.3 Mb N-bases), together with 206 Mb of alternative haplotypes. The assembly fully or partially fills 274 (28.4%) N-gaps in the reference genome GRCh38. Comparison to GRCh38 reveals 12.8 Mb of HX1-specific sequences, including 4.1 Mb that are not present in previously reported Asian genomes. Furthermore, long-read sequencing of the transcriptome reveals novel spliced genes that are not annotated in GENCODE and are missed by short-read RNA-Seq. Our results imply that improved characterization of genome functional variation may require the use of a range of genomic technologies on diverse human populations.
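
The N50 statistics quoted in the abstract follow the standard definition: the largest length such that contigs (or scaffolds) at least that long together cover at least half of the total assembly. A minimal sketch of that calculation is shown below; the function name and the toy contig lengths are illustrative and are not taken from the HX1 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be a non-empty collection")


# Toy assembly with nine contigs (arbitrary units).
toy_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```

A higher N50 means that fewer, longer pieces make up half of the assembly, which is why it is commonly reported as a contiguity measure.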


Subject(s)
Asian People/genetics , Genome, Human , Genomic Structural Variation , Humans , Male , Sequence Analysis, DNA , Sequence Analysis, RNA , Transcriptome
18.
Stat Med ; 35(24): 4306-4319, 2016 10 30.
Article in English | MEDLINE | ID: mdl-27241902

ABSTRACT

Recurrent event data are quite common in biomedical and epidemiological studies. A significant portion of these data also contain additional longitudinal information on surrogate markers. Previous studies have shown that popular methods using a Cox model with longitudinal outcomes as time-dependent covariates may lead to biased results, especially when longitudinal outcomes are measured with error. Hence, it is important to incorporate longitudinal information into the analysis properly. To achieve this, we model the correlation between longitudinal and recurrent event processes using latent random effect terms. We then propose a two-stage conditional estimating equation approach to model the rate function of recurrent event process conditioned on the observed longitudinal information. The performance of our proposed approach is evaluated through simulation. We also apply the approach to analyze cocaine addiction data collected by the University of Connecticut Health Center. The data include recurrent event information on cocaine relapse and longitudinal cocaine craving scores. Copyright © 2016 John Wiley & Sons, Ltd.


Subject(s)
Data Accuracy , Longitudinal Studies , Cocaine-Related Disorders , Humans , Recurrence
19.
PLoS Genet ; 12(2): e1005847, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26863142

ABSTRACT

Mexicans are a recent admixture of Amerindians, Europeans, and Africans. We performed local ancestry analysis of Mexican samples from two genome-wide association studies obtained from dbGaP, and discovered that at the MHC region Mexicans have excessive African ancestral alleles compared to the rest of the genome, which is the hallmark of recent selection for admixed samples. The estimated selection coefficients are 0.05 and 0.07 for two datasets, which put our finding among the strongest known selections observed in humans, namely, lactase selection in northern Europeans and sickle-cell trait in Africans. Using inaccurate Amerindian training samples was a major concern for the credibility of previously reported selection signals in Latinos. Taking advantage of the flexibility of our statistical model, we devised a model fitting technique that can learn Amerindian ancestral haplotype from the admixed samples, which allows us to infer local ancestries for Mexicans using only European and African training samples. The strong selection signal at the MHC remains without Amerindian training samples. Finally, we note that medical history studies suggest such a strong selection at MHC is plausible in Mexicans.


Subject(s)
Gene Pool , Major Histocompatibility Complex/genetics , Selection, Genetic , Black People/genetics , Gene Dosage , Genealogy and Heraldry , Humans , Mexico , Principal Component Analysis , White People/genetics
20.
Stat Med ; 35(14): 2422-40, 2016 06 30.
Article in English | MEDLINE | ID: mdl-26790617

ABSTRACT

Spatiotemporal calibration of output from deterministic models is an increasingly popular tool to more accurately and efficiently estimate the true distribution of spatial and temporal processes. Current calibration techniques have focused on a single source of data on observed measurements of the process of interest that are both temporally and spatially dense. Additionally, these methods often calibrate deterministic models available in grid-cell format with pixel sizes small enough that the centroid of the pixel closely approximates the measurement for other points within the pixel. We develop a modeling strategy that allows us to simultaneously incorporate information from two sources of data on observed measurements of the process (that differ in their spatial and temporal resolutions) to calibrate estimates from a deterministic model available on a regular grid. This method not only improves estimates of the pollutant at the grid centroids but also refines the spatial resolution of the grid data. The modeling strategy is illustrated by calibrating and spatially refining daily estimates of ambient nitrogen dioxide concentration over Connecticut for 1994 from the Community Multiscale Air Quality model (temporally dense grid-cell estimates on a large pixel size) using observations from an epidemiologic study (spatially dense and temporally sparse) and Environmental Protection Agency monitoring stations (temporally dense and spatially sparse). Copyright © 2016 John Wiley & Sons, Ltd.


Subject(s)
Models, Statistical , Spatio-Temporal Analysis , Air Pollutants/analysis , Air Pollution/analysis , Air Pollution/statistics & numerical data , Biostatistics , Calibration , Connecticut , Environmental Exposure/analysis , Environmental Exposure/statistics & numerical data , Environmental Monitoring/statistics & numerical data , Humans , Nitrogen Dioxide/analysis , United States , United States Environmental Protection Agency