Estimating Sampling Selection Bias in Human Genetics: A Phenomenological Approach.
Risso, Davide; Taglioli, Luca; De Iasio, Sergio; Gueresi, Paola; Alfani, Guido; Nelli, Sergio; Rossi, Paolo; Paoli, Giorgio; Tofanelli, Sergio.
Affiliation
  • Risso D; National Institute on Deafness and Other Communication Disorders, NIH, Bethesda, MD 20854, United States of America; Laboratory of Molecular Anthropology and Centre for Genome Biology, Department of BiGeA, University of Bologna, via Selmi 3, 40126 Bologna, Italy.
  • Taglioli L; Dipartimento di Biologia, University of Pisa, Via Ghini 13, 56126 Pisa, Italy.
  • De Iasio S; Dipartimento di Genetica Biologia dei Microrganismi Antropologia Evoluzione, University of Parma, Parco Area delle Scienze 11/a, 43124 Parma, Italy.
  • Gueresi P; Dipartimento di Scienze Statistiche, University of Bologna, Via Belle Arti 41, 40126 Bologna, Italy.
  • Alfani G; Bocconi University, Dondena Centre and IGIER, Milan, Italy.
  • Nelli S; Archivio di Stato, Lucca, Italy.
  • Rossi P; Dipartimento di Fisica, University of Pisa, Largo Bruno Pontecorvo 3, 56127 Pisa, Italy.
  • Paoli G; Dipartimento di Biologia, University of Pisa, Via Ghini 13, 56126 Pisa, Italy.
  • Tofanelli S; Dipartimento di Biologia, University of Pisa, Via Ghini 13, 56126 Pisa, Italy.
PLoS One ; 10(10): e0140146, 2015.
Article in English | MEDLINE | ID: mdl-26452043
This research is the first empirical attempt to calculate the various components of the hidden bias associated with the sampling strategies routinely used in human genetics, with special reference to surname-based strategies. We reconstructed surname distributions of 26 Italian communities with different demographic features across the last six centuries (years 1447-2001). The degree of overlap between "reference founding core" distributions and the distributions obtained from sampling the present-day communities by probabilistic and selective methods was quantified under different conditions and models. When taking into account only one individual per surname (low kinship model), the average discrepancy was 59.5%, with a peak of 84% by random sampling. When multiple individuals per surname were considered (high kinship model), the discrepancy decreased by 8-30% at the cost of a larger variance. Criteria aimed at maximizing locally spread patrilineages and long-term residency appeared to be affected by recent gene flows much more than expected. Selection of the more frequent family names following low kinship criteria proved to be a suitable approach only for historically stable communities. In all other cases, true random sampling, despite its high variance, did not return more biased estimates than other selective methods. Our results indicate that the sampling of individuals bearing historically documented surnames (founders' method) should be applied, especially when studying the male-specific genome, to prevent an over-stratification of ancient and recent genetic components that heavily biases inferences and statistics.
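The comparison described in the abstract can be illustrated with a small sketch. The abstract does not state the exact overlap statistic, so the measure below (the share of sampled individuals whose surname is absent from the founding core) is an assumption chosen for illustration. The function names and the toy surname lists are hypothetical and are not the study's data.

```python
# Illustrative sketch only: the discrepancy measure, function names and toy
# data are assumptions, not the statistic or data used in the article.

def discrepancy(sample, founding_core):
    """Share of sampled individuals whose surname is not in the founding core."""
    if not sample:
        raise ValueError("sample is empty")
    outside = sum(1 for surname in sample if surname not in founding_core)
    return outside / len(sample)

def low_kinship_sample(population):
    """Keep one individual per distinct surname, in order of first appearance."""
    return list(dict.fromkeys(population))

# Hypothetical founding core and present-day community (one entry per person).
founding_core = {"Rossi", "Bianchi", "Nelli"}
present_day = ["Rossi"] * 5 + ["Bianchi"] * 3 + ["Esposito"] * 2 + ["Russo", "Greco"]

# High kinship model: every individual counts, so frequent surnames weigh more.
high_kinship = discrepancy(present_day, founding_core)

# Low kinship model: each surname counts once, whatever its frequency.
low_kinship = discrepancy(low_kinship_sample(present_day), founding_core)

print(f"high kinship discrepancy: {high_kinship:.3f}")
print(f"low kinship discrepancy:  {low_kinship:.3f}")
```

In this toy community the founding surnames are also the most frequent ones, so counting every individual gives a lower discrepancy than counting each surname once. That matches the direction reported in the abstract, but the figures carry no empirical meaning.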

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Genetics, Population Study type: Prognostic_studies / Qualitative_research / Risk_factors_studies Limits: Humans / Male Language: English Journal: PLoS One Journal subject: SCIENCE / MEDICINE Year: 2015 Document type: Article Country of affiliation: Italy Country of publication: United States
