Variances and covariances of linear summary statistics of segregating sites.
Theor Popul Biol; 145: 95-108, 2022 06.
Article in En | MEDLINE | ID: mdl-35390435
ABSTRACT
Each mutation in a population sample of DNA sequences can be classified by the number of sequences that inherit the mutant nucleotide; the resulting frequencies are known as mutations of different sizes, or the site frequency spectrum. Many summary statistics can be defined as linear functions of these frequencies. A flexible class of such linear summary statistics is explored analytically in this paper; it includes several well-known quantities, such as the number of segregating sites and the mean number of nucleotide differences between two sequences. Some asymptotic variances and covariances are obtained, and analytical formulas for the variances and covariances of nine such linear summary statistics are derived, most of which were previously unknown. This study not only provides some theoretical foundations for exploring linear summary statistics, but also provides some new linear summary statistics that may be utilized for analyzing sample polymorphism. Furthermore, it is shown that a newly developed linear summary statistic has a smaller variance than Watterson's estimator almost uniformly, and that a class of linear summary statistics that put too much weight on mutations of smaller sizes has asymptotically non-zero variance.
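To make the idea of a linear summary statistic concrete, the following is a minimal sketch, not taken from the paper, of how three standard quantities can be written as weighted sums of the site frequency spectrum. It assumes an unfolded spectrum stored as a list whose entry i-1 is the number of mutations of size i in a sample of n sequences. The function names and the example spectrum are hypothetical and chosen only for illustration; the weights themselves are the textbook ones for the number of segregating sites, Watterson's estimator, and the mean number of pairwise nucleotide differences.

```python
from math import comb


def linear_statistic(sfs, weights):
    """Return sum_i weights[i] * sfs[i], a linear function of the spectrum."""
    return sum(w * x for w, x in zip(weights, sfs))


def segregating_sites_weights(n):
    """Weight 1 on every mutation size: gives the number of segregating sites."""
    return [1.0] * (n - 1)


def watterson_weights(n):
    """Weight 1/a_n on every size, where a_n = 1 + 1/2 + ... + 1/(n-1)."""
    a_n = sum(1.0 / i for i in range(1, n))
    return [1.0 / a_n] * (n - 1)


def pairwise_difference_weights(n):
    """Weight i(n-i)/C(n,2) on size i: gives the mean pairwise difference."""
    pairs = comb(n, 2)
    return [i * (n - i) / pairs for i in range(1, n)]


# Hypothetical example: n = 5 sequences, spectrum for mutation sizes 1..4.
n = 5
sfs = [4, 2, 1, 1]

S = linear_statistic(sfs, segregating_sites_weights(n))
theta_w = linear_statistic(sfs, watterson_weights(n))
pi = linear_statistic(sfs, pairwise_difference_weights(n))

print(S, theta_w, pi)
```

Only the weight vector changes between the three statistics, which is why variances and covariances for the whole class can be studied together.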
Databases: MEDLINE
Main subject: Polymorphism, Genetic / Nucleotides
Language: En
Journal: Theor Popul Biol
Publication year: 2022
Document type: Article