pixy 2.0.0.beta14

_images/pixy_logo.png

What is pixy?

pixy is a command-line tool for painlessly computing unbiased estimators of population genetic summary statistics that measure genetic variation within (π, θW, Tajima's D) and between (dxy, FST) populations from a VCF.

Many tools for computing these summary statistics from VCFs produce biased estimates in the presence of missing data. This is because they often make the simplifying assumption that if a genotype is missing, it is homozygous reference (0/0) by state. See the pixy paper and the Watterson's θ / Tajima's D follow-up paper for the details.

As of version 2.0, pixy also supports organisms of arbitrary (and variable) ploidy, multiallelic sites, and both .tbi and .csi VCF indexes.

How should I cite pixy?

If you use pixy in your research, please cite the manuscript below, as well the Zenodo DOI of specific version of pixy used for your project.

Manuscript: Korunes, K.L. and Samuk, K. (2021), pixy: Unbiased estimation of nucleotide diversity and divergence in the presence of missing data. Molecular Ecology Resources. Accepted Author Manuscript. https://doi.org/10.1111/1755-0998.13326

And, if using the unbiased estimator of Tajima's D or Watterson's theta:

Bailey, N., Stevison, L., & Samuk, K. (2025). Correcting for bias in estimates of θw and Tajima’s D from missing data in next-generation sequencing. Molecular Ecology Resources, e14104. https://doi.org/10.1111/1755-0998.14104

Zenodo DOI for various versions of pixy: Go to https://zenodo.org/record/4432294 and find the DOI that matches the version used (the current version is shown first).