Tests for checking Batch Effects
Condition | Batch 151218 | Batch 170208 |
---|---|---|
Condition crowned | 5 | 0 |
Condition worker | 2 | 3 |
Measure | Standardized Pearson Correlation Coefficient | Cramer’s V |
---|---|---|
Confounding Coefficients (0=no confounding, 1=complete confounding) | 0.7746 | 0.6547 |
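
Both coefficients can be reproduced from the batch-by-condition counts listed at the top of this report. The following is a minimal sketch in Python, assuming the coefficients are derived from an uncorrected Pearson chi-square statistic; the function name `confounding_coefficients` is hypothetical and is not part of the tool that produced this report.

```python
import math

def confounding_coefficients(counts):
    """Return (standardized Pearson contingency coefficient, Cramer's V)
    for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    n = sum(row_totals)

    # Uncorrected Pearson chi-square statistic (assumption: no continuity correction).
    chi2 = 0.0
    for i, row in enumerate(counts):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(counts), len(counts[0]))
    pearson_c = math.sqrt(chi2 / (n + chi2))
    # Divide by the maximum attainable value so the coefficient spans 0 to 1.
    standardized_c = pearson_c / math.sqrt((k - 1) / k)
    cramers_v = math.sqrt(chi2 / (n * (k - 1)))
    return standardized_c, cramers_v

# Counts from the report: rows are conditions (crowned, worker),
# columns are batches (151218, 170208).
counts = [[5, 0],
          [2, 3]]
std_c, v = confounding_coefficients(counts)
print(round(std_c, 4), round(v, 4))  # 0.7746 0.6547
```

With these counts the sketch returns 0.7746 and 0.6547, which agrees with the reported values.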
Statistic | Full (Condition+Batch) | Condition | Batch |
---|---|---|---|
Min. | 0 | 0 | 0 |
1st Qu. | 14.36 | 3.674 | 3.325 |
Median | 30 | 13.99 | 12.48 |
Mean | 32.74 | 20.11 | 18.91 |
3rd Qu. | 48.8 | 31.64 | 29.58 |
Max. | 96.13 | 91.12 | 95.62 |
Factor | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | Ps<0.05 |
---|---|---|---|---|---|---|---|
Batch P-values | 2.159e-05 | 0.1816 | 0.4337 | 0.4525 | 0.7117 | 1 | 0.0882 |
Condition P-values | 1.268e-05 | 0.1597 | 0.4099 | 0.4377 | 0.6989 | 1 | 0.1012 |
Boxplots of all values for each sample, colored by batch membership.
Gene | Condition: worker (logFC) | AveExpr | t | P.Value | adj.P.Val | B |
---|---|---|---|---|---|---|
GCNT4 | 273.3 | 254.6 | 8.823 | 2.874e-05 | 0.2645 | -4.405 |
CD164L2 | 95.9 | 85 | 7.746 | 7.099e-05 | 0.2645 | -4.411 |
SNAPC2 | 272.6 | 728.3 | 7.516 | 8.717e-05 | 0.2645 | -4.412 |
NRIP3 | 81.5 | 71.9 | 7.412 | 9.587e-05 | 0.2645 | -4.413 |
MYOZ3 | 94.8 | 81.1 | 7.104 | 0.0001278 | 0.2645 | -4.415 |
CXCL2 | 280.2 | 123.9 | 6.69 | 0.0001909 | 0.2645 | -4.419 |
SEZ6 | 131.5 | 148 | 6.622 | 0.0002043 | 0.2645 | -4.42 |
PLB1 | 130.1 | 157 | 6.445 | 0.0002442 | 0.2645 | -4.421 |
DDIT3 | 194 | 615.7 | 6.394 | 0.0002572 | 0.2645 | -4.422 |
GALNT15 | 97.6 | 101.1 | 6.391 | 0.000258 | 0.2645 | -4.422 |
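
Every gene listed shares the same adj.P.Val (0.2645) even though the raw P.Value differs. This pattern is typical of a step-up false discovery rate adjustment such as Benjamini-Hochberg, which forces adjusted values to be monotone in rank. The sketch below assumes that kind of adjustment was used (the report does not name the method); `bh_adjust` is a hypothetical helper, and the input p-values are invented for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping a running minimum so that
    # adjusted values never increase as the raw p-value gets smaller.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Illustrative p-values only (not taken from the report).
example = [0.01, 0.02, 0.03, 0.04, 0.9]
print(bh_adjust(example))
```

In this toy input the four smallest p-values all receive an adjusted value of about 0.05, which mirrors how the top-ranked genes above can tie on adj.P.Val.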
This plot helps identify outlying samples.
This is a heatmap of the given data matrix showing the batch effects and the variation across conditions.
This is a heatmap of the correlation between samples.
This is a circular dendrogram of the given data matrix, colored by batch to show the batch effects.
This is a plot of the top two principal components colored by batch to show the batch effects.
Principal Component | Proportion of Variance (%) | Cumulative Proportion of Variance (%) | Percent Variation Explained by Either Condition or Batch | Percent Variation Explained by Condition | Condition Significance (p-value) | Percent Variation Explained by Batch | Batch Significance (p-value) |
---|---|---|---|---|---|---|---|
PC1 | 25.47 | 25.47 | 67.9 | 62.3 | 0.07814 | 48.4 | 0.3053 |
PC2 | 18.9 | 44.37 | 17.1 | 2.5 | 0.7181 | 15.5 | 0.3037 |
PC3 | 12.84 | 57.21 | 24.6 | 21 | 0.1941 | 2.4 | 0.5788 |
PC4 | 11.12 | 68.33 | 67.5 | 2 | 0.02121 | 26.9 | 0.0071 |
PC5 | 8.252 | 76.58 | 9.8 | 5.6 | 0.4121 | 0 | 0.5879 |
PC6 | 6.815 | 83.4 | 1.7 | 0.5 | 0.7509 | 0.1 | 0.7783 |
PC7 | 6.108 | 89.51 | 3.6 | 3 | 0.6392 | 0.3 | 0.8386 |
PC8 | 5.43 | 94.94 | 2 | 2 | 0.7946 | 1 | 0.981 |
PC9 | 5.063 | 100 | 5.7 | 1.1 | 0.8685 | 5.3 | 0.5745 |
PC10 | 2.975e-29 | 100 | 11 | 2.9 | 0.8737 | 10.7 | 0.4499 |
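
The "Percent Variation Explained" columns can be read as the R² obtained by regressing each principal component's sample scores on condition, batch, or both. A minimal sketch for the single-factor case follows, assuming that interpretation; `percent_explained` is a hypothetical helper and the scores are invented, since the report does not include per-sample PC scores.

```python
def percent_explained(scores, labels):
    """Percent of the variance in `scores` explained by one categorical
    factor: R^2 of a one-way model, multiplied by 100."""
    grand_mean = sum(scores) / len(scores)
    total_ss = sum((s - grand_mean) ** 2 for s in scores)

    # Collect the scores belonging to each factor level.
    groups = {}
    for score, label in zip(scores, labels):
        groups.setdefault(label, []).append(score)

    # Between-group sum of squares: variation of group means around the grand mean.
    between_ss = sum(
        len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
        for vals in groups.values()
    )
    return 100.0 * between_ss / total_ss

# Invented PC scores for six samples in two batches (illustration only).
scores = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
batch = ["A", "A", "A", "B", "B", "B"]
print(round(percent_explained(scores, batch), 1))
```

Because condition and batch are partly confounded in this design, the two single-factor percentages can overlap and need not add up to the combined value; for PC1, 62.3 and 48.4 sum to more than the 67.9 explained by either factor.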
This heatmap shows how the mean, variance, skewness, and kurtosis of gene expression vary between samples, grouped by batch, to reveal batch effects.
## Note: The sample-wise p-value is calculated for the variation across samples of the measure taken across genes. The gene-wise p-value is calculated for the variation of each gene between batches of the measure taken within each batch. If the data are quantile normalized, the sample-wise measure across genes is the same for all samples, so the gene-wise p-value is the better measure.
This plot shows whether a parametric or non-parametric prior is appropriate for this data. It also shows the Kolmogorov-Smirnov test comparing the parametric and non-parametric prior distributions.
## Found 2 batches
## Adjusting for 1 covariate(s) or covariate level(s)
## Standardizing Data across genes
## Fitting L/S model and finding priors
## Warning in ks.test(gamma.hat[1, ], "pnorm", gamma.bar[1], sqrt(t2[1])): ties should not be present for the Kolmogorov-Smirnov test
## Warning in ks.test(gamma.hat[1, ], "pnorm", gamma.bar[1], sqrt(shinyInput$t2[1])): ties should not be present for the Kolmogorov-Smirnov test
## Warning in ks.test(delta.hat[1, ], invgam): p-value will be approximate in the presence of ties
## Batch mean distribution across genes: Normal vs Empirical distribution
## Two-sided Kolmogorov-Smirnov test
## Selected Batch: 1
## Statistic D = 0.03245
## p-value = 2.42e-14
##
##
## Batch Variance distribution across genes: Inverse Gamma vs Empirical distribution
## Two-sided Kolmogorov-Smirnov test
## Selected Batch: 1
## Statistic D = 0.1747
## p-value = 0

Note: The non-parametric version of ComBat takes much longer to run, and we recommend it only when the shape of the non-parametric curve differs widely from the parametric one, such as with a bimodal or highly skewed distribution. Otherwise, the difference in batch adjustment is negligible, and the parametric version is recommended even if the p-value of the KS test above is significant.
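
The statistic D reported above is the largest gap between the empirical distribution of the per-gene batch estimates and the assumed prior distribution. The sketch below computes a one-sample D against a normal distribution in plain Python. `ks_statistic_normal` is a hypothetical helper, the sample is invented, and the sketch does not reproduce the p-value calculation.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def ks_statistic_normal(sample, mean, sd):
    """Two-sided one-sample Kolmogorov-Smirnov statistic D against N(mean, sd^2)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = normal_cdf(x, mean, sd)
        # Compare the model CDF with the empirical CDF just after and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Invented values standing in for per-gene batch mean estimates.
sample = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
print(ks_statistic_normal(sample, mean=0.0, sd=1.0))
```

With thousands of genes, even a small D such as 0.03245 can produce a very small p-value, which is consistent with the note that a significant KS test alone does not call for the non-parametric version.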
## Number of Surrogate Variables found in the given data: 0