█████╗ ███████╗ ███╗ ███╗ ██████╗
██╔══██╗ ██╔════╝ ████╗ ████║ ██╔════╝
███████║ ███████╗ ██╔████╔██║ ██║
██╔══██║ ╚════██║ ██║╚██╔╝██║ ██║
██║ ██║ ███████║ ██║ ╚═╝ ██║ ╚██████╗
╚═╝ ╚═╝ ╚══════╝ ╚═╝ ╚═╝ ╚═════╝
This page contains genomic annotations from the paper:
 P. Palamara, J. Terhorst, Y. Song, A. Price. High-throughput inference of pairwise coalescence times identifies signals of selection and enriched disease heritability. Nature Genetics, 2018 (article, free full text).
The software page can be found here. For any questions or comments, please contact Pier Palamara at <lastname>@stats.ox.ac.uk. Additional data from the paper may be available upon request.
Bed format annotations

DRC_150 annotation of recent positive selection (hg19 coordinates, computed using UK Biobank data). This annotation was computed using the posterior probability of pairwise coalescent times for all analyzed pairs of individuals in the UKBB data set. The posterior probability tables for all pairs of samples were summed together and renormalized. The DRC_150 statistic is the observed average coalescent rate between generations 0 and 150. The 4th and 5th columns of the bed file report the DRC_150 statistic and its p-value for each site.
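As a rough illustration of how the DRC_150 bed file might be consumed, the sketch below parses the five columns described above and keeps sites whose p-value falls below a chosen threshold. It assumes whitespace-delimited lines with no header other than optional comment, "track", or "browser" lines; the names DrcSite, parse_drc_bed, and filter_by_pvalue are hypothetical and are not part of the ASMC software.

```python
from typing import Iterable, List, NamedTuple


class DrcSite(NamedTuple):
    """One row of the DRC_150 bed file (hypothetical container)."""
    chrom: str
    start: int
    end: int
    drc150: float   # 4th column: DRC_150 statistic
    pvalue: float   # 5th column: p-value


def parse_drc_bed(lines: Iterable[str]) -> List[DrcSite]:
    """Parse bed lines; assumes whitespace-delimited columns (an assumption)."""
    sites = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and common bed header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, drc, p = line.split()[:5]
        sites.append(DrcSite(chrom, int(start), int(end), float(drc), float(p)))
    return sites


def filter_by_pvalue(sites: Iterable[DrcSite], threshold: float) -> List[DrcSite]:
    """Keep sites whose p-value is at or below the threshold."""
    return [s for s in sites if s.pvalue <= threshold]
```

With a real file, the lines could come from an open file handle, e.g. parse_drc_bed(open(path)), where path points to the downloaded bed file. The significance threshold is left to the user; none is implied here.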

ASMC_avg annotation of background selection (hg19 coordinates, computed using GoNL data). This annotation was computed using the posterior probability of pairwise coalescent times for all pairs of individuals in the GoNL data set. The posterior probability tables for all pairs of samples were summed together and renormalized. This averaged posterior probability was then used to compute the expected coalescent time at each site. The ASMC_avg annotation used in LDSC analyses (see below) involves additional processing steps.
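The averaging described above (sum the per-pair posterior tables, renormalize, then take the expected coalescent time) can be sketched at a single site as follows. This is a minimal illustration, not the pipeline used in the paper: it assumes each pair's posterior is a list of probabilities over the same discretized time intervals, and that each interval has a representative time (bin_times), which is a hypothetical input introduced here.

```python
from typing import List, Sequence


def average_posterior(pair_posteriors: Sequence[Sequence[float]]) -> List[float]:
    """Sum per-pair posteriors over time bins and renormalize to sum to 1."""
    n_bins = len(pair_posteriors[0])
    summed = [0.0] * n_bins
    for post in pair_posteriors:
        if len(post) != n_bins:
            raise ValueError("all posteriors must use the same time bins")
        for k, p in enumerate(post):
            summed[k] += p
    total = sum(summed)
    if total <= 0:
        raise ValueError("posterior mass must be positive")
    return [s / total for s in summed]


def expected_time(posterior: Sequence[float], bin_times: Sequence[float]) -> float:
    """Expected coalescent time: sum over bins of probability * bin time.

    bin_times holds one representative time per bin (an assumption).
    """
    return sum(p * t for p, t in zip(posterior, bin_times))
```

For example, two pairs with posteriors [0.5, 0.5, 0.0] and [0.0, 0.5, 0.5] average to [0.25, 0.5, 0.25]; with representative times of 10, 100, and 1000 generations the expected time is 302.5 generations.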

ASMC_med annotation (hg19 coordinates, computed using GoNL data). As above, using the median instead of the mean.
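One possible reading of "median instead of mean" is the median of the averaged posterior over discretized time intervals, i.e. the first interval at which the cumulative probability reaches 0.5. The sketch below illustrates that reading only; the exact definition used for the annotation is given in the paper, and the function name and the bin_times input (one representative time per interval) are hypothetical.

```python
from typing import Sequence


def posterior_median(posterior: Sequence[float], bin_times: Sequence[float]) -> float:
    """Return the representative time of the bin where the CDF first reaches 0.5.

    Assumes `posterior` sums to 1 and `bin_times` is sorted in increasing order.
    """
    if len(posterior) != len(bin_times):
        raise ValueError("posterior and bin_times must have the same length")
    cumulative = 0.0
    for p, t in zip(posterior, bin_times):
        cumulative += p
        if cumulative >= 0.5:
            return t
    # Reached only if the input does not sum to at least 0.5.
    raise ValueError("posterior mass is below 0.5; input is not normalized")
```

Unlike the mean, this summary is insensitive to how much mass sits in the most ancient intervals, as long as the 0.5 crossing point does not move.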

ASMC_med.het annotation (hg19 coordinates, computed using GoNL data). As above, using the median instead of the mean. Here, however, we only consider sites where the pair of individuals is heterozygous (genotype = 0,1 or 1,0), so that this quantity is related to allele age.
LDScore format annotations
 ASMC_avg annotation of background selection (hg19 coordinates, computed using GoNL data). This annotation contains the same values as the bed format ASMC_avg annotation above, but only for SNPs used in the LDScore analysis, which are also quantile-normalized using 10 minor allele frequency bins (see paper for details). The tarball contains an annot.gz file (ASMC_avg annotation values) and an l2.ldscore.gz file (LDScore values) for each chromosome. This annotation is now included in the baselineLD v2.0 model, which you can find here (version info here). The baselineLD v2.0 model is recommended for LDScore analyses, as it jointly considers a total of 76 genomic annotations.
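To give a feel for quantile normalization within minor allele frequency (MAF) bins, the sketch below splits SNPs into equal-count bins by MAF and, within each bin, replaces each value by its rank-based quantile in (0, 1). Both choices (equal-count bins and a uniform target distribution) are assumptions made for illustration; the procedure actually used for the annotation is described in the paper. The function name is hypothetical.

```python
from typing import List, Sequence


def maf_bin_quantile_normalize(values: Sequence[float],
                               mafs: Sequence[float],
                               n_bins: int = 10) -> List[float]:
    """Rank-normalize `values` separately within equal-count MAF bins."""
    n = len(values)
    if n != len(mafs):
        raise ValueError("values and mafs must have the same length")
    # Assign SNP indices to bins by MAF rank (equal-count bins, an assumption).
    order = sorted(range(n), key=lambda i: mafs[i])
    bins: List[List[int]] = [[] for _ in range(n_bins)]
    for rank, i in enumerate(order):
        bins[rank * n_bins // n].append(i)
    # Within each bin, map values to mid-rank quantiles (r + 0.5) / m.
    out = [0.0] * n
    for members in bins:
        ranked = sorted(members, key=lambda i: values[i])
        m = len(ranked)
        for r, i in enumerate(ranked):
            out[i] = (r + 0.5) / m
    return out
```

Normalizing within MAF bins removes differences in the annotation's distribution that are driven by allele frequency alone. Ties are broken by sort order in this sketch rather than averaged.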
Average posterior TMRCA values
 UKBB_posteriors. Files containing the average posterior TMRCA density inferred for the UK Biobank data set at each site. These can be used to plot the heatmaps in figures 3.b, 3.c, and S7 of the paper by running the script
sh getFigures.sh
included in the tarball, which in turn calls the plotPosteriorHeatMap.py script.