BatchQC Report

Tests for checking Batch Effects

Summary

Confounding

Number of samples in each Batch and Condition

                    Batch 151218   Batch 170208
Condition crowned   5              0
Condition worker    2              3

Measures of confounding between Batch and Condition

Confounding Coefficients (0=no confounding, 1=complete confounding)
  Standardized Pearson Correlation Coefficient   0.7746
  Cramer’s V                                     0.6547
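
The two confounding coefficients can be recomputed from the sample-count table. The sketch below is in Python rather than the R code that produced this report. It assumes that Cramer's V and the standardized Pearson contingency coefficient are both derived from an uncorrected chi-square statistic. That definition is an assumption, not something the report states, but it reproduces both reported values.

```python
import math

# Sample counts from the Batch x Condition table in this report
# (rows: crowned, worker; columns: batch 151218, batch 170208).
counts = [[5, 0],
          [2, 3]]

def chi_square(table):
    """Pearson chi-square statistic without continuity correction."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n

def confounding_coefficients(table):
    """Return (standardized Pearson contingency coefficient, Cramer's V)."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))
    cramers_v = math.sqrt(chi2 / (n * (k - 1)))
    pearson_c = math.sqrt(chi2 / (n + chi2))
    # Divide by the maximum attainable value so the coefficient spans 0..1.
    std_pearson_c = pearson_c / math.sqrt((k - 1) / k)
    return std_pearson_c, cramers_v

std_c, v = confounding_coefficients(counts)
print(round(std_c, 4), round(v, 4))  # 0.7746 0.6547
```

Both values are well above 0 because every crowned sample sits in batch 151218, so Batch and Condition cannot be fully separated in this design.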

Variation Analysis

Variation explained by Batch and Condition

          Full (Condition+Batch)   Condition   Batch
Min.      0                        0           0
1st Qu.   14.36                    3.674       3.325
Median    30                       13.99       12.48
Mean      32.74                    20.11       18.91
3rd Qu.   48.8                     31.64       29.58
Max.      96.13                    91.12       95.62
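
One way to read the per-gene values summarized above is as the percentage of a gene's variance explained by a factor. The Python sketch below assumes that reading and covers only the single-factor case, where it is the between-group sum of squares divided by the total sum of squares. The sample layout matches this report, but the expression values are hypothetical, and the function name `percent_variation_explained` is illustrative rather than part of BatchQC.

```python
def percent_variation_explained(values, groups):
    """100 * between-group sum of squares / total sum of squares."""
    grand_mean = sum(values) / len(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # a constant gene has no variation to explain
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    between_ss = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
        for vs in by_group.values()
    )
    return 100.0 * between_ss / total_ss

# Sample layout from this report: 5 crowned and 2 worker samples in
# batch 151218, 3 worker samples in batch 170208.
condition = ["crowned"] * 5 + ["worker"] * 5
batch = ["151218"] * 7 + ["170208"] * 3

# Hypothetical expression values for a single gene (not from the report).
expression = [10.0, 11.0, 9.0, 10.0, 10.0, 14.0, 15.0, 15.0, 14.0, 16.0]

pct_condition = percent_variation_explained(expression, condition)
pct_batch = percent_variation_explained(expression, batch)
print(f"Condition: {pct_condition:.1f}%  Batch: {pct_batch:.1f}%")
```

For this made-up gene the two percentages sum to more than 100. When Batch and Condition are confounded, the same variation is attributed to both factors, which is why the Condition and Batch columns need not add up to the Full column.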

P-value Analysis

Distribution of Batch and Condition Effect p-values Across Genes

                     Min.        1st Qu.   Median   Mean     3rd Qu.   Max.   Ps<0.05
Batch P-values       2.159e-05   0.1816    0.4337   0.4525   0.7117    1      0.0882
Condition P-values   1.268e-05   0.1597    0.4099   0.4377   0.6989    1      0.1012

Differential Expression

Expression Plot

Boxplots of all values for each sample, colored by batch membership.

LIMMA

          Condition: worker (logFC)   AveExpr   t       P.Value     adj.P.Val   B
GCNT4     273.3                       254.6     8.823   2.874e-05   0.2645      -4.405
CD164L2   95.9                        85        7.746   7.099e-05   0.2645      -4.411
SNAPC2    272.6                       728.3     7.516   8.717e-05   0.2645      -4.412
NRIP3     81.5                        71.9      7.412   9.587e-05   0.2645      -4.413
MYOZ3     94.8                        81.1      7.104   0.0001278   0.2645      -4.415
CXCL2     280.2                       123.9     6.69    0.0001909   0.2645      -4.419
SEZ6      131.5                       148       6.622   0.0002043   0.2645      -4.42
PLB1      130.1                       157       6.445   0.0002442   0.2645      -4.421
DDIT3     194                         615.7     6.394   0.0002572   0.2645      -4.422
GALNT15   97.6                        101.1     6.391   0.000258    0.2645      -4.422
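
All ten genes above share adj.P.Val = 0.2645 even though their raw p-values differ. If the adjustment is Benjamini-Hochberg (an assumption; the report does not name the method), this is expected: each p-value is scaled by the number of tests divided by its rank, and a running minimum taken from the largest p-value downward can collapse many genes onto one adjusted value. The Python sketch below uses hypothetical p-values, and `bh_adjust` is an illustrative name, not a limma function.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, keeping a running
    # minimum so adjusted values never decrease with rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values (not taken from the report).
raw = [0.0001, 0.0002, 0.0003, 0.0004, 0.9]
for p, q in zip(raw, bh_adjust(raw)):
    print(f"raw={p:<8g} adjusted={q:.4g}")
```

In this example the four smallest p-values end up with the same adjusted value (up to floating-point rounding), mirroring the pattern in the table. With adj.P.Val at 0.2645, none of the listed genes would pass a conventional 0.05 threshold after adjustment.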

Median Correlations

This plot helps identify outlying samples.

Heatmaps

Heatmap

This is a heatmap of the given data matrix, showing batch effects and variation across conditions.

Sample Correlations

This is a heatmap of the correlation between samples.

Circular Dendrogram

This is a Circular Dendrogram of the given data matrix colored by batch to show the batch effects.

PCA: Principal Component Analysis

PCA

This is a plot of the top two principal components colored by batch to show the batch effects.

Explained Variation

Columns:
  Var (%)             Proportion of Variance (%)
  Cum. Var (%)        Cumulative Proportion of Variance (%)
  Cond or Batch (%)   Percent Variation Explained by Either Condition or Batch
  Cond (%)            Percent Variation Explained by Condition
  Cond p              Condition Significance (p-value)
  Batch (%)           Percent Variation Explained by Batch
  Batch p             Batch Significance (p-value)

       Var (%)     Cum. Var (%)   Cond or Batch (%)   Cond (%)   Cond p    Batch (%)   Batch p
PC1    25.47       25.47          67.9                62.3       0.07814   48.4        0.3053
PC2    18.9        44.37          17.1                2.5        0.7181    15.5        0.3037
PC3    12.84       57.21          24.6                21         0.1941    2.4         0.5788
PC4    11.12       68.33          67.5                2          0.02121   26.9        0.0071
PC5    8.252       76.58          9.8                 5.6        0.4121    0           0.5879
PC6    6.815       83.4           1.7                 0.5        0.7509    0.1         0.7783
PC7    6.108       89.51          3.6                 3          0.6392    0.3         0.8386
PC8    5.43        94.94          2                   2          0.7946    1           0.981
PC9    5.063       100            5.7                 1.1        0.8685    5.3         0.5745
PC10   2.975e-29   100            11                  2.9        0.8737    10.7        0.4499
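
The cumulative column is the running sum of the per-component proportion of variance. The short Python sketch below copies the proportions from the table above and checks this.

```python
from itertools import accumulate

# "Proportion of Variance (%)" for PC1..PC10, copied from the table above.
proportion = [25.47, 18.9, 12.84, 11.12, 8.252,
              6.815, 6.108, 5.43, 5.063, 2.975e-29]

# Running total gives the cumulative proportion of variance.
cumulative = list(accumulate(proportion))

for pc, (p, c) in enumerate(zip(proportion, cumulative), start=1):
    print(f"PC{pc}: proportion={p:.4g}%  cumulative={c:.4g}%")
```

PC10 carries essentially no variance (2.975e-29 is floating-point noise). With 10 samples, mean-centered data has at most 9 non-trivial principal components, so the percentages reported for PC10 in the other columns describe a component with no real signal.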

Shape

This is a heatmap showing the variation of gene expression mean, variance, skewness and kurtosis between samples, grouped by batch, to reveal batch effects.

## Note: The sample-wise p-value is calculated for the variation across samples of the measure computed across genes. The gene-wise p-value is calculated for the variation of each gene between batches, using the measure computed within each batch. If the data is quantile normalized, then the sample-wise measure across genes is the same for all samples, and the gene-wise p-value is a good measure.
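
The note's point about normalization can be illustrated with a minimal Python sketch of quantile normalization. It assumes the standard rank-mean procedure and ignores ties. The data are hypothetical, and `quantile_normalize` is an illustrative name rather than a BatchQC function.

```python
def quantile_normalize(samples):
    """Quantile-normalize equal-length samples (one list of gene values each).

    Each value is replaced by the mean, across samples, of the values that
    share its within-sample rank. Ties get no special treatment here.
    """
    n_genes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [
        sum(s[r] for s in sorted_samples) / len(samples)
        for r in range(n_genes)
    ]
    normalized = []
    for s in samples:
        order = sorted(range(n_genes), key=lambda g: s[g])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = rank_means[rank]
        normalized.append(out)
    return normalized

# Hypothetical data: three samples, four genes each.
samples = [[5.0, 2.0, 3.0, 4.0],
           [4.0, 1.0, 4.5, 2.0],
           [3.0, 4.0, 6.0, 8.0]]

normalized = quantile_normalize(samples)
for s in normalized:
    print(sorted(s))  # every sample now holds the same set of values
```

After normalization every sample contains the same set of values, so the mean, variance, skewness and kurtosis across genes are identical for all samples and a sample-wise test has nothing left to detect. Only the gene-wise comparison remains informative.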

Combat Plots

This is a plot showing whether a parametric or non-parametric prior is appropriate for this data. It also shows the Kolmogorov-Smirnov test comparing the parametric and non-parametric prior distributions.

## Found 2 batches
## Adjusting for 1 covariate(s) or covariate level(s)
## Standardizing Data across genes
## Fitting L/S model and finding priors

## Warning in ks.test(gamma.hat[1, ], "pnorm", gamma.bar[1], sqrt(t2[1])): ties should not be present for the Kolmogorov-Smirnov test

## Warning in ks.test(gamma.hat[1, ], "pnorm", gamma.bar[1], sqrt(shinyInput$t2[1])): ties should not be present for the Kolmogorov-Smirnov test
## Warning in ks.test(delta.hat[1, ], invgam): p-value will be approximate in the presence of ties
## Batch mean distribution across genes: Normal vs Empirical distribution
## Two-sided Kolmogorov-Smirnov test
## Selected Batch: 1
## Statistic D = 0.03245
## p-value = 2.42e-14
## 
## 
## Batch Variance distribution across genes: Inverse Gamma vs Empirical distribution
## Two-sided Kolmogorov-Smirnov test
## Selected Batch: 1
## Statistic D = 0.1747
## p-value = 0

Note: The non-parametric version of ComBat takes much longer to run, and we recommend it only when the shape of the non-parametric curve differs widely from the parametric one, such as a bimodal or highly skewed distribution. Otherwise, the difference in batch adjustment is negligible, and the parametric version is recommended even if the p-value of the KS test above is significant.
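
The statistic D reported by the Kolmogorov-Smirnov test is the largest vertical gap between the empirical distribution of the per-gene estimates and the reference distribution. The Python sketch below shows the one-sample case against a normal distribution, which corresponds to the batch-mean comparison in the `ks.test(..., "pnorm", ...)` call shown in the warnings. The input values are hypothetical, and `ks_statistic` is an illustrative name, not part of BatchQC or R.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def ks_statistic(values, mean, sd):
    """One-sample, two-sided KS statistic D against Normal(mean, sd)."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = normal_cdf(x, mean, sd)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical per-gene batch-mean estimates (not taken from the report).
gamma_hat = [-1.2, -0.4, -0.1, 0.0, 0.3, 0.5, 0.9, 1.6]
print(f"D = {ks_statistic(gamma_hat, mean=0.0, sd=1.0):.4f}")
```

Because the p-value shrinks as the number of genes grows, a small D such as 0.03245 can still be highly significant. This fits the note's advice to judge the shape of the curves rather than rely on the p-value alone.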

SVA

Summary

## Number of Surrogate Variables found in the given data: 0