Tests for checking Batch Effects

Condition | Batch 150629 | Batch 150722 | Batch 151218 |
---|---|---|---|
Condition crowned | 4 | 3 | 5 |
Condition worker | 0 | 3 | 2 |

Measure | Standardized Pearson Correlation Coefficient | Cramer’s V |
---|---|---|
Confounding Coefficients (0=no confounding, 1=complete confounding) | 0.5394 | 0.4126 |
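As a rough check, both coefficients can be recomputed from the condition-by-batch counts above. This is a minimal sketch that assumes the column headed "Standardized Pearson Correlation Coefficient" is Pearson's contingency coefficient divided by its maximum attainable value, and that Cramér's V uses the usual chi-squared definition. The report does not state its formulas, and the function name is hypothetical.

```python
import math

def confounding_coefficients(table):
    """Return (standardized Pearson contingency coefficient, Cramer's V)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(table), len(table[0]))  # smaller table dimension
    # Contingency coefficient, rescaled by its maximum sqrt((k - 1) / k).
    pearson_c = math.sqrt(chi2 / (chi2 + n))
    standardized_c = pearson_c / math.sqrt((k - 1) / k)
    cramers_v = math.sqrt(chi2 / (n * (k - 1)))
    return standardized_c, cramers_v

# Rows: crowned, worker. Columns: batches 150629, 150722, 151218.
counts = [[4, 3, 5],
          [0, 3, 2]]
std_c, v = confounding_coefficients(counts)
print(f"standardized Pearson C = {std_c:.4f}, Cramer's V = {v:.4f}")
```

Under these assumptions the sketch gives 0.5394 and 0.4126, matching the reported values, which suggests moderate confounding between condition and batch.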

Statistic | Full (Condition+Batch) | Condition | Batch |
---|---|---|---|
Min. | 0.123 | 0 | 0 |
1st Qu. | 10.78 | 0.176 | 7.822 |
Median | 15.68 | 0.883 | 12.21 |
Mean | 17.94 | 2.754 | 14.28 |
3rd Qu. | 22.47 | 3.241 | 18.49 |
Max. | 82.73 | 59.89 | 81.65 |
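Assuming the table above summarizes, per gene, the percentage of expression variation explained by each model (an interpretation, since the table does not say how the values were computed), the batch-only case for a single gene can be sketched as a one-way R². The function name and the expression values are hypothetical.

```python
from collections import defaultdict

def percent_variation_explained(values, groups):
    """One-way R^2 as a percentage: between-group SS / total SS."""
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)

    # Collect the values belonging to each group (e.g. each batch).
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)

    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
        for vs in by_group.values()
    )
    # A constant gene has no variation to explain; return 0 instead of dividing by zero.
    return 100.0 * ss_between / ss_total if ss_total > 0 else 0.0

# Hypothetical expression values for one gene across six samples.
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
batch = ["150629", "150629", "150722", "150722", "151218", "151218"]
print(round(percent_variation_explained(expression, batch), 2))
```

Applying the same function with condition labels instead of batch labels would give the condition-only percentage. The full model needs a two-factor fit, which this sketch does not cover.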

P-values | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | Ps<0.05 |
---|---|---|---|---|---|---|---|
Batch P-values | 1.385e-05 | 0.2384 | 0.3798 | 0.4021 | 0.5415 | 0.9999 | 0.0397 |
Condition P-values | 0.0005223 | 0.4024 | 0.5892 | 0.5786 | 0.7781 | 1 | 0.01303 |
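Each row above is a summary of one p-value per gene. The sketch below shows how such a row could be produced, reading the "Ps<0.05" column as the fraction of p-values below 0.05 (an interpretation of the header, not something the report states). The function name and the input values are hypothetical.

```python
import statistics

def summarize_p_values(p_values, alpha=0.05):
    """Six-number summary plus the fraction of p-values below alpha."""
    # Quartiles by linear interpolation between order statistics.
    q1, median, q3 = statistics.quantiles(p_values, n=4, method="inclusive")
    return {
        "min": min(p_values),
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(p_values),
        "q3": q3,
        "max": max(p_values),
        "frac_below_alpha": sum(p < alpha for p in p_values) / len(p_values),
    }

# Hypothetical per-gene p-values.
example = [0.01, 0.2, 0.4, 0.6, 0.8]
for name, value in summarize_p_values(example).items():
    print(f"{name}: {value:.4g}")
```

If the p-values were uniformly distributed (no effect at all), about 5% would fall below 0.05, so the reported fractions of 0.0397 for batch and 0.01303 for condition are both below that baseline.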

Boxplots of all values for each sample, colored by batch membership.

Gene | Condition: worker (logFC) | AveExpr | t | P.Value | adj.P.Val | B |
---|---|---|---|---|---|---|
PCSK1N | 387.3 | 502.2 | 3.969 | 0.001486 | 1 | -4.595 |
RHOV | -24.68 | 36.29 | -3.91 | 0.001665 | 1 | -4.595 |
OR2F1 | -495.2 | 1277 | -3.593 | 0.003082 | 1 | -4.595 |
HBB | -181.8 | 275.9 | -3.581 | 0.003155 | 1 | -4.595 |
ABLIM2 | 42.34 | 54.94 | 3.454 | 0.004046 | 1 | -4.595 |
KCNJ15 | -51.44 | 106.9 | -3.392 | 0.004571 | 1 | -4.595 |
WNT4 | -67.54 | 90.12 | -3.387 | 0.004617 | 1 | -4.595 |
NEURL1B | -42 | 117 | -3.178 | 0.006949 | 1 | -4.595 |
DMBT1 | 36567 | 20833 | 3.064 | 0.008687 | 1 | -4.595 |
CPVL | 340.2 | 261.6 | 3.064 | 0.008691 | 1 | -4.595 |
This plot helps identify outlying samples.
This is a heatmap of the given data matrix showing the batch effects and variations with different conditions.
This is a heatmap of the correlation between samples.
This is a Circular Dendrogram of the given data matrix colored by batch to show the batch effects.
This is a plot of the top two principal components colored by batch to show the batch effects.

PC | Proportion of Variance (%) | Cumulative Proportion of Variance (%) | Percent Variation Explained by Either Condition or Batch | Percent Variation Explained by Condition | Condition Significance (p-value) | Percent Variation Explained by Batch | Batch Significance (p-value) |
---|---|---|---|---|---|---|---|
PC1 | 55.3 | 55.3 | 12.2 | 0 | 0.6149 | 10.4 | 0.4291 |
PC2 | 8.733 | 64.03 | 43.4 | 12.2 | 0.1341 | 32.3 | 0.05782 |
PC3 | 6.341 | 70.37 | 45.8 | 3.3 | 0.7231 | 45.2 | 0.02334 |
PC4 | 5.836 | 76.21 | 25.6 | 0.3 | 0.712 | 24.8 | 0.1493 |
PC5 | 3.609 | 79.82 | 2.1 | 1 | 0.8599 | 1.8 | 0.9328 |
PC6 | 3.164 | 82.98 | 8.6 | 1.9 | 0.3946 | 3.2 | 0.6323 |
PC7 | 2.765 | 85.74 | 8.3 | 1.7 | 0.4393 | 3.8 | 0.6336 |
PC8 | 1.958 | 87.7 | 32.9 | 30.1 | 0.0263 | 0.5 | 0.7651 |
PC9 | 1.933 | 89.64 | 24.7 | 12.8 | 0.1323 | 9.8 | 0.3859 |
PC10 | 1.823 | 91.46 | 2.2 | 0.7 | 0.8576 | 2 | 0.9061 |
PC11 | 1.72 | 93.18 | 8.4 | 4.7 | 0.5749 | 6.1 | 0.7741 |
PC12 | 1.659 | 94.84 | 15 | 3.2 | 0.2593 | 6 | 0.4289 |
PC13 | 1.523 | 96.36 | 2.5 | 0.3 | 0.9218 | 2.5 | 0.8635 |
PC14 | 1.37 | 97.73 | 21.3 | 0 | 0.4087 | 16.9 | 0.212 |
PC15 | 1.168 | 98.9 | 34.6 | 15.6 | 0.4315 | 31.3 | 0.1909 |
PC16 | 1.102 | 100 | 12.3 | 11.8 | 0.2757 | 3.5 | 0.9698 |
PC17 | 6.851e-29 | 100 | 33.9 | 23.5 | 0.1607 | 22.6 | 0.3892 |
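The cumulative column in the table above is the running sum of the per-component proportions. The sketch below recomputes it from the rounded values shown, so small differences in the last digit are expected. The variable names are hypothetical.

```python
from itertools import accumulate

# "Proportion of Variance (%)" for PC1..PC17, copied from the table above.
proportions = [
    55.3, 8.733, 6.341, 5.836, 3.609, 3.164, 2.765, 1.958, 1.933,
    1.823, 1.72, 1.659, 1.523, 1.37, 1.168, 1.102, 6.851e-29,
]

# Running total of the proportions gives the cumulative proportion.
cumulative = list(accumulate(proportions))

for index, (prop, cum) in enumerate(zip(proportions, cumulative), start=1):
    print(f"PC{index}: {prop:.4g}% (cumulative {cum:.2f}%)")
```

PC17 carries essentially no variance because centering 17 samples leaves at most 16 independent directions of variation.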
This heatmap shows how the mean, variance, skewness, and kurtosis of gene expression vary between samples, grouped by batch, to reveal batch effects.
## Note: The sample-wise p-value is calculated for the variation across samples of a measure computed across genes. The gene-wise p-value is calculated for the variation of each gene between batches, using the measure computed within each batch. If the data are quantile normalized, the sample-wise measure across genes is the same for all samples, and the gene-wise p-value is a good measure.
This plot shows whether a parametric or non-parametric prior is appropriate for this data. It also shows the Kolmogorov-Smirnov test comparing the parametric and non-parametric prior distributions.
## Found 3 batches
## Adjusting for 1 covariate(s) or covariate level(s)
## Standardizing Data across genes
## Fitting L/S model and finding priors
## Warning in ks.test(gamma.hat[1, ], "pnorm", gamma.bar[1], sqrt(t2[1])): ties should not be present for the Kolmogorov-Smirnov test
## Warning in ks.test(gamma.hat[1, ], "pnorm", gamma.bar[1], sqrt(shinyInput$t2[1])): ties should not be present for the Kolmogorov-Smirnov test
## Warning in ks.test(delta.hat[1, ], invgam): p-value will be approximate in the presence of ties
## Batch mean distribution across genes: Normal vs Empirical distribution
## Two-sided Kolmogorov-Smirnov test
## Selected Batch: 1
## Statistic D = 0.1042
## p-value = 0
##
##
## Batch Variance distribution across genes: Inverse Gamma vs Empirical distribution
## Two-sided Kolmogorov-Smirnov test
## Selected Batch: 1
## Statistic D = 0.1983
## p-value = 0

Note: The non-parametric version of ComBat takes much longer to run, and we recommend it only when the shape of the non-parametric curve differs widely from the parametric one, such as a bimodal or highly skewed distribution. Otherwise, the difference in batch adjustment is negligible, and the parametric version is recommended even if the p-value of the KS test above is significant.
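The statistic D reported by the Kolmogorov-Smirnov tests above is the largest gap between the empirical distribution of the estimates and the fitted parametric distribution. The sketch below illustrates the one-sample case against a normal distribution. It is not the report's implementation, and the function names and sample values are hypothetical.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of Normal(mean, sd^2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def ks_statistic(sample, cdf):
    """Two-sided one-sample KS statistic: max |empirical CDF - cdf|."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so both sides of the step have to be checked.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical batch-mean estimates for a handful of genes.
gamma_hat = [-1.0, 0.0, 1.0]
d = ks_statistic(gamma_hat, lambda x: normal_cdf(x, mean=0.0, sd=1.0))
print(f"D = {d:.4f}")
```

With many genes, even a small D such as the 0.1042 reported above can produce a p-value that prints as 0, which is why the note recommends judging the shape of the curves and not relying on the p-value alone.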
## Number of Surrogate Variables found in the given data: 1