Application of t-SNE to human genetic data.
J Bioinform Comput Biol; 15(4): 1750017, 2017 Aug.
Article | En | MEDLINE | ID: mdl-28718343
ABSTRACT
t-distributed stochastic neighbor embedding (t-SNE) is a new dimension reduction and visualization technique for high-dimensional data. t-SNE is rarely applied to human genetic data, even though it is commonly used in other data-intensive biological fields, such as single-cell genomics. We explore the applicability of t-SNE to human genetic data and make three observations: (i) like previously used dimension reduction techniques such as principal component analysis (PCA), t-SNE is able to separate samples from different continents; (ii) t-SNE is more robust than PCA to the presence of outliers; (iii) t-SNE is able to display both continental and sub-continental patterns in a single plot. We conclude that the ability of t-SNE to reveal population stratification at different scales could be useful for human genetic association studies.
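As a rough illustration of the comparison described in the abstract, the sketch below embeds a small simulated genotype matrix with both PCA and t-SNE using scikit-learn. It is not the procedure used in the article: the three toy populations, the sample and SNP counts, the drift model, and the perplexity value are all assumptions chosen only to keep the example small and runnable.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical toy data (an assumption, not the article's dataset):
# 3 populations x 100 samples, 500 biallelic SNPs.
n_pops, n_per_pop, n_snps = 3, 100, 500

# Shared baseline allele frequencies, perturbed per population
# to mimic differences between populations.
base_freq = rng.uniform(0.1, 0.9, size=n_snps)
genotypes, labels = [], []
for pop in range(n_pops):
    freq = np.clip(base_freq + rng.normal(0.0, 0.1, size=n_snps), 0.01, 0.99)
    # Each genotype is the count of alternate alleles (0, 1 or 2) at a SNP.
    genotypes.append(rng.binomial(2, freq, size=(n_per_pop, n_snps)))
    labels += [pop] * n_per_pop

X = np.vstack(genotypes).astype(float)
labels = np.array(labels)

# Centre each SNP column before embedding.
X -= X.mean(axis=0)

# Linear embedding: the first two principal components.
pca_coords = PCA(n_components=2).fit_transform(X)

# Non-linear embedding: t-SNE in two dimensions (perplexity is an assumed value).
tsne_coords = TSNE(
    n_components=2, perplexity=30.0, init="pca", random_state=0
).fit_transform(X)

print(pca_coords.shape, tsne_coords.shape)
```

Plotting `pca_coords` and `tsne_coords` colored by `labels` gives a side-by-side view of the two embeddings. With real data, the genotype matrix would come from a genotype file rather than a simulation, and the perplexity would need tuning.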
Full text: 1
Collection: 01-international
Database: MEDLINE
Main subject: Computer Graphics / Computational Biology / Polymorphism, Single Nucleotide / Genome-Wide Association Study / Human Genetics / Genetics, Population
Limits: Humans
Language: En
Journal: J Bioinform Comput Biol
Journal subject: BIOLOGY / MEDICAL INFORMATICS
Year: 2017
Document type: Article
Affiliation country: United States