This folder contains the scripts to recreate the analysis from Figure 4 and the Figure 4 Supplements. 

The scripts require the HMPv35 dataset (http://hmpdacc.org/HMQCP/). Download it first, then update the file paths in these scripts to point to the download location.


The scripts should be run in the following order:

(1) hmp_norarify_data_processing.Rmd - This script performs data preprocessing on the HMP dataset and outputs the file HMP.RData, which is required by subsequent scripts.

(2) ilr_var_truncateZeroes_permTree_bysite_hardac (the *.sh script calls the *.slurm script, which executes the *.R script) - These scripts require a Linux cluster with the Slurm job scheduler (see the header of the *.slurm script for other hardware requirements). They compute the null model and test statistics for Figure 4 and output the file bysite.200000.40.RData, which is used by subsequent scripts.

(3) ilr_var_truncateZeroes_permTree_calcpvalues.R - This script calculates FDR-corrected p-values for the relationship between phylogenetic depth and balance variance.

(4) ilr_var_truncateZeroes_permTree_figures.R - This script creates the figure components that go into Figure 4 and the Figure 4 supplements. Note that this script requires utility scripts and data from update_taxonomy_and_tax2tree/.
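
The run order above can be sketched as a small shell script. This is only an illustration under several assumptions that the scripts themselves do not confirm: that everything is run from this folder, that HMP.RData and bysite.200000.40.RData are written to the current directory, that the .Rmd file is rendered with the rmarkdown package, and that the *.sh wrapper is invoked with bash. The helper and function names (require_file, run_figure4_part1, run_figure4_part2) are hypothetical and are not part of the repository.

```shell
# Sketch of the run order for the Figure 4 scripts (assumptions noted below).

# Hypothetical helper: fail with a message if an expected file is missing.
require_file() {
    if [ ! -f "$1" ]; then
        echo "missing: $1" >&2
        return 1
    fi
}

# Steps (1) and (2): preprocessing, then submission of the cluster job.
run_figure4_part1() {
    # Assumes the rmarkdown package is installed and that rendering
    # the .Rmd writes HMP.RData to the current directory.
    Rscript -e 'rmarkdown::render("hmp_norarify_data_processing.Rmd")' || return 1
    require_file HMP.RData || return 1

    # The .sh calls the .slurm, which executes the .R script.
    # Invoking the wrapper with bash is an assumption.
    bash ilr_var_truncateZeroes_permTree_bysite_hardac.sh || return 1
    echo "Wait for the Slurm job to finish before running part 2."
}

# Steps (3) and (4): run only after the Slurm job has written its output.
run_figure4_part2() {
    # Output location in the current directory is an assumption.
    require_file bysite.200000.40.RData || return 1
    Rscript ilr_var_truncateZeroes_permTree_calcpvalues.R || return 1
    Rscript ilr_var_truncateZeroes_permTree_figures.R
}
```

After sourcing the file, call run_figure4_part1, wait for the Slurm job to complete, and then call run_figure4_part2. Splitting the run in two reflects that step (2) is a batch job whose output must exist before steps (3) and (4) can start.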
