Tests for checking Batch Effects
| | Batch 150629 | Batch 150722 | Batch 151218 | Batch 170217 |
---|---|---|---|---|
Condition crowned | 4 | 3 | 5 | 0 |
Condition worker | 0 | 3 | 2 | 6 |
| | Standardized Pearson Correlation Coefficient | Cramer’s V |
---|---|---|
Confounding Coefficients (0=no confounding, 1=complete confounding) | 0.8108 | 0.6998 |
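Both confounding coefficients can be derived from the batch-by-condition counts in the first table. The sketch below is an illustration in Python, not the report's own code: it assumes the "Standardized Pearson Correlation Coefficient" is Pearson's contingency coefficient divided by its maximum attainable value, and that Cramer's V uses the uncorrected chi-squared statistic. The function name `confounding_coefficients` is hypothetical. Under those assumptions the counts above give values that round to the reported 0.8108 and 0.6998.

```python
import math

def confounding_coefficients(table):
    """Return (standardized Pearson contingency coefficient, Cramer's V).

    `table` is a list of rows of counts (conditions x batches).
    Assumes every row and every column has a non-zero total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-squared statistic, without continuity correction.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(table), len(table[0]))  # smaller table dimension
    pearson_c = math.sqrt(chi2 / (chi2 + n))
    # Divide by the maximum attainable value sqrt((k - 1) / k) so that
    # complete confounding maps to 1.
    std_pearson = pearson_c / math.sqrt((k - 1) / k)
    cramers_v = math.sqrt(chi2 / (n * (k - 1)))
    return std_pearson, cramers_v

# Counts copied from the batch-by-condition table in this report.
counts = [
    [4, 3, 5, 0],  # Condition crowned
    [0, 3, 2, 6],  # Condition worker
]
std_pearson, cramers_v = confounding_coefficients(counts)
print(f"{std_pearson:.4f} {cramers_v:.4f}")
```

Both measures are 0 when batch and condition are independent and 1 when each batch contains a single condition, which is why values near 0.7 to 0.8 indicate substantial confounding here.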
| | Full (Condition+Batch) | Condition | Batch |
---|---|---|---|
Min. | 0.41 | 0 | 0.212 |
1st Qu. | 26.32 | 0.522 | 24.65 |
Median | 37.62 | 2.354 | 36.32 |
Mean | 38.38 | 5.061 | 36.97 |
3rd Qu. | 49.85 | 7.007 | 48.84 |
Max. | 91.34 | 47.12 | 89.81 |
| | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | Ps<0.05 |
---|---|---|---|---|---|---|---|
Batch P-values | 1.041e-09 | 0.008028 | 0.05534 | 0.1528 | 0.2107 | 0.9985 | 0.4851 |
Condition P-values | 0.001539 | 0.4855 | 0.6946 | 0.654 | 0.8553 | 1 | 0.006726 |
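Each row of the p-value table is a six-number summary of the per-gene p-values followed by the fraction of genes with a p-value below 0.05. The following Python sketch shows one way such a row could be produced. The input list is made-up illustrative data, the function name `summarize_p_values` is hypothetical, and the choice of the "inclusive" quantile method (which matches the default quantile type in R) is an assumption about how the report was generated.

```python
import statistics

def summarize_p_values(p_values, alpha=0.05):
    """Summarize per-gene p-values in the layout of the table above."""
    # Quartiles using linear interpolation with min/max as the 0th/100th
    # percentiles (assumed to match R's default quantile type).
    q1, median, q3 = statistics.quantiles(p_values, n=4, method="inclusive")
    return {
        "Min.": min(p_values),
        "1st Qu.": q1,
        "Median": median,
        "Mean": statistics.fmean(p_values),
        "3rd Qu.": q3,
        "Max.": max(p_values),
        # Fraction of genes whose p-value falls below alpha.
        "Ps<0.05": sum(p < alpha for p in p_values) / len(p_values),
    }

# Hypothetical p-values for five genes, for illustration only.
example = [0.001, 0.02, 0.04, 0.3, 0.9]
summary = summarize_p_values(example)
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```

Read this way, the table says that roughly 48.5% of genes have a batch p-value below 0.05, compared with about 0.7% for condition.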
Boxplots of all values for each sample, colored by batch membership.
| | Condition: worker (logFC) | AveExpr | t | P.Value | adj.P.Val | B |
---|---|---|---|---|---|---|
FCGR1A | 28.76 | 66.61 | 2.995 | 0.007558 | 1 | -4.595 |
DDO | -4495 | 4271 | -2.949 | 0.008366 | 1 | -4.595 |
SLC9A4 | 34.07 | 55.43 | 2.918 | 0.008946 | 1 | -4.595 |
NFATC2 | 22.24 | 79.26 | 2.742 | 0.01312 | 1 | -4.595 |
UNG | 34.51 | 86.65 | 2.724 | 0.01363 | 1 | -4.595 |
PCDHB9 | -32.66 | 45.26 | -2.702 | 0.01429 | 1 | -4.595 |
FMNL1 | 121.4 | 531.3 | 2.535 | 0.02038 | 1 | -4.595 |
PPP1R3E | 9.073 | 10.74 | 2.523 | 0.02091 | 1 | -4.595 |
BIRC3 | 29.76 | 73.13 | 2.505 | 0.0217 | 1 | -4.595 |
PPFIBP2 | 143.5 | 572.3 | 2.488 | 0.02249 | 1 | -4.595 |
This plot helps identify outlying samples.
This is a heatmap of the given data matrix showing the batch effects and variations with different conditions.
This is a heatmap of the correlation between samples.
This is a Circular Dendrogram of the given data matrix colored by batch to show the batch effects.
This is a plot of the top two principal components colored by batch to show the batch effects.
| | Proportion of Variance (%) | Cumulative Proportion of Variance (%) | Percent Variation Explained by Either Condition or Batch | Percent Variation Explained by Condition | Condition Significance (p-value) | Percent Variation Explained by Batch | Batch Significance (p-value) |
---|---|---|---|---|---|---|---|
PC1 | 51.24 | 51.24 | 44.2 | 0.8 | 0.8729 | 44.1 | 0.01387 |
PC2 | 12.06 | 63.3 | 82.4 | 31.7 | 0.4103 | 81.7 | 2e-05 |
PC3 | 7.245 | 70.54 | 9.3 | 0.5 | 0.5679 | 7.6 | 0.6331 |
PC4 | 4.362 | 74.91 | 12.4 | 0.9 | 0.3639 | 8.2 | 0.515 |
PC5 | 3.722 | 78.63 | 64.8 | 0.6 | 0.917 | 64.8 | 0.00026 |
PC6 | 2.563 | 81.19 | 8.8 | 5 | 0.5967 | 7.4 | 0.8584 |
PC7 | 2.031 | 83.22 | 6.5 | 0.3 | 0.8703 | 6.4 | 0.7546 |
PC8 | 1.781 | 85 | 7 | 0 | 0.4276 | 3.6 | 0.72 |
PC9 | 1.654 | 86.66 | 10.4 | 7 | 0.1768 | 0.6 | 0.8766 |
PC10 | 1.538 | 88.19 | 6.7 | 1.6 | 0.9394 | 6.6 | 0.8079 |
PC11 | 1.492 | 89.69 | 8.9 | 0 | 0.4492 | 5.9 | 0.6309 |
PC12 | 1.41 | 91.1 | 7.2 | 0.4 | 0.399 | 3.3 | 0.7282 |
PC13 | 1.155 | 92.25 | 2.7 | 1.1 | 0.7166 | 1.9 | 0.9595 |
PC14 | 1.069 | 93.32 | 7.1 | 3.6 | 0.8079 | 6.8 | 0.8805 |
PC15 | 1.055 | 94.37 | 18.3 | 9.9 | 0.4098 | 15 | 0.6139 |
PC16 | 1.008 | 95.38 | 33.9 | 16.5 | 0.01437 | 6.9 | 0.231 |
PC17 | 0.873 | 96.26 | 4.1 | 1.5 | 0.9639 | 4.1 | 0.9214 |
PC18 | 0.8531 | 97.11 | 6.7 | 1.3 | 0.7784 | 6.3 | 0.7912 |
PC19 | 0.788 | 97.9 | 28.8 | 1.7 | 0.05399 | 11.9 | 0.1139 |
PC20 | 0.7564 | 98.65 | 2.1 | 0.3 | 0.9064 | 2.1 | 0.9506 |
PC21 | 0.6867 | 99.34 | 19.8 | 12.5 | 0.07855 | 4.3 | 0.6571 |
PC22 | 0.6594 | 100 | 8.1 | 2.9 | 0.2435 | 0.7 | 0.7983 |
PC23 | 4.987e-29 | 100 | 30.6 | 21.6 | 0.3018 | 26.3 | 0.5221 |
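The "Percent Variation Explained" columns can be read as the share of each principal component's variance that is accounted for by group membership. The report does not state how these values are computed; the Python sketch below assumes a one-way model, so the percentage is the between-group sum of squares divided by the total sum of squares. The scores and labels are made-up illustrative data and `percent_variation_explained` is a hypothetical helper. The "Either Condition or Batch" column would need a model with both factors, which this sketch does not cover.

```python
def percent_variation_explained(scores, groups):
    """Percent of the variance in `scores` explained by `groups`.

    Computed as 100 * SS_between / SS_total for a one-way layout.
    Assumes `scores` is not constant (SS_total > 0).
    """
    grand_mean = sum(scores) / len(scores)
    ss_total = sum((s - grand_mean) ** 2 for s in scores)

    # Collect the scores belonging to each group label.
    by_group = {}
    for score, label in zip(scores, groups):
        by_group.setdefault(label, []).append(score)

    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in by_group.values()
    )
    return 100.0 * ss_between / ss_total

# Hypothetical scores of six samples on one principal component.
pc_scores = [-2.0, -1.0, 1.0, 2.0, 4.0, 5.0]
batch = ["A", "A", "B", "B", "C", "C"]
condition = ["x", "y", "x", "y", "x", "y"]

by_batch = percent_variation_explained(pc_scores, batch)
by_condition = percent_variation_explained(pc_scores, condition)
print(f"batch: {by_batch:.1f}%  condition: {by_condition:.1f}%")
```

In this toy example the component separates the batches almost perfectly while condition explains very little, the same pattern the table shows for PC2 and PC5.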
This is a heatmap showing the variation of gene expression mean, variance, skewness and kurtosis between samples, grouped by batch to show the batch effects.
## Note: Sample-wise p-value is calculated for the variation across samples on the measure across genes. Gene-wise p-value is calculated for the variation of each gene between batches on the measure across each batch. If the data is quantile normalized, then the sample-wise measure across genes is the same for all samples and the gene-wise p-value is a good measure.
This plot shows whether a parametric or non-parametric prior is appropriate for this data. It also shows the Kolmogorov-Smirnov test comparing the parametric and non-parametric prior distributions.
## Found 4 batches
## Adjusting for 1 covariate(s) or covariate level(s)
## Standardizing Data across genes
## Fitting L/S model and finding priors
## Warning in ks.test(gamma.hat[1, ], "pnorm", gamma.bar[1], sqrt(t2[1])): ties should not be present for the Kolmogorov-Smirnov test
## Warning in ks.test(gamma.hat[1, ], "pnorm", gamma.bar[1], sqrt(shinyInput$t2[1])): ties should not be present for the Kolmogorov-Smirnov test
## Warning in ks.test(delta.hat[1, ], invgam): p-value will be approximate in the presence of ties
## Batch mean distribution across genes: Normal vs Empirical distribution
## Two-sided Kolmogorov-Smirnov test
## Selected Batch: 1
## Statistic D = 0.043
## p-value = 0
##
##
## Batch Variance distribution across genes: Inverse Gamma vs Empirical distribution
## Two-sided Kolmogorov-Smirnov test
## Selected Batch: 1
## Statistic D = 0.1355
## p-value = 0

Note: The non-parametric version of ComBat takes much longer to run, and we recommend it only when the shape of the non-parametric curve differs widely from the parametric one, such as a bimodal or highly skewed distribution. Otherwise, the difference in batch adjustment is negligible, and the parametric version is recommended even if the p-value of the KS test above is significant.
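The statistic D in the Kolmogorov-Smirnov output above is the largest vertical gap between the empirical distribution of the per-gene batch estimates and the fitted parametric distribution. The Python sketch below illustrates that definition for a normal reference distribution. It is not the code behind the report: the sample is made up, and the helper names `normal_cdf` and `ks_statistic` are hypothetical.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def ks_statistic(sample, cdf):
    """Two-sided one-sample KS statistic D = sup |F_n(x) - F(x)|."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x, so the
        # gap is checked on both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Hypothetical standardized batch-mean estimates for three genes.
sample = [-1.0, 0.0, 1.0]
d = ks_statistic(sample, normal_cdf)
print(f"D = {d:.4f}")
```

With many thousands of genes even a small D, such as the 0.043 reported for the batch means, yields a p-value near zero, which is consistent with the note's advice to judge the fit by the shape of the curves rather than by the p-value alone.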
## Number of Surrogate Variables found in the given data: 2