Tests for checking Batch Effects

Condition | Batch 151113 | Batch 170208 |
---|---|---|
Condition crowned | 12 | 0 |
Condition worker | 6 | 5 |

Measure | Standardized Pearson Correlation Coefficient | Cramer’s V |
---|---|---|
Confounding Coefficients (0=no confounding, 1=complete confounding) | 0.682 | 0.5505 |
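
Both coefficients can be recomputed from the condition-by-batch sample counts listed at the top of the report. The sketch below is written in Python rather than the R that produced this report. It assumes the first value is the Pearson contingency coefficient divided by its maximum, sqrt((k-1)/k), where k is the smaller table dimension, and that the second is Cramér's V. The function name is illustrative and does not belong to any package.

```python
import math

def confounding_coefficients(table):
    """Return (standardized Pearson contingency coefficient, Cramér's V)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0]))
    contingency = math.sqrt(chi2 / (chi2 + n))
    # Rescale so that complete confounding maps to 1.
    standardized = contingency / math.sqrt((k - 1) / k)
    cramers_v = math.sqrt(chi2 / (n * (k - 1)))
    return standardized, cramers_v

# Rows: condition (crowned, worker); columns: batch (151113, 170208).
counts = [[12, 0], [6, 5]]
standardized, cramers_v = confounding_coefficients(counts)
print(round(standardized, 3), round(cramers_v, 4))
```

Under these assumptions the counts give 0.682 and 0.5505, matching the reported values. The empty cell (no crowned samples in batch 170208) is what drives both coefficients well above zero.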

Statistic | Full (Condition+Batch) | Condition | Batch |
---|---|---|---|
Min. | 0 | 0 | 0 |
1st Qu. | 5.697 | 1.812 | 0.437 |
Median | 10.72 | 5.468 | 1.766 |
Mean | 12.11 | 7.289 | 4.337 |
3rd Qu. | 16.39 | 10.7 | 5.273 |
Max. | 81.33 | 54.47 | 80.83 |

P-value summary | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | Ps<0.05 |
---|---|---|---|---|---|---|---|
Batch P-values | 4.06e-07 | 0.2077 | 0.4219 | 0.4591 | 0.7031 | 1 | 0.04219 |
Condition P-values | 9.497e-05 | 0.1129 | 0.2481 | 0.3332 | 0.5069 | 1 | 0.09364 |

Boxplots of all values for each sample, colored by batch membership.

Gene | Condition: worker (logFC) | AveExpr | t | P.Value | adj.P.Val | B |
---|---|---|---|---|---|---|
RND3 | 272.2 | 446 | 4.697 | 0.0001297 | 0.4698 | -4.37 |
CDKL2 | 18.67 | 22.17 | 4.531 | 0.0001915 | 0.4698 | -4.379 |
DUSP27 | 36.5 | 36.09 | 4.203 | 0.0004159 | 0.4698 | -4.397 |
GPR146 | 48.5 | 128.2 | 4.086 | 0.0005488 | 0.4698 | -4.404 |
PTPN14 | 69.92 | 181 | 4.059 | 0.0005847 | 0.4698 | -4.405 |
FAT2 | 112.3 | 123 | 4.035 | 0.0006191 | 0.4698 | -4.407 |
FREM2 | 44.17 | 70.78 | 4.002 | 0.0006697 | 0.4698 | -4.409 |
FAM83B | 38.08 | 58.7 | 3.968 | 0.0007262 | 0.4698 | -4.411 |
MFSD6 | 99.58 | 241.1 | 3.966 | 0.0007294 | 0.4698 | -4.411 |
GRIK2 | 44.5 | 105.7 | 3.951 | 0.0007552 | 0.4698 | -4.412 |
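
All ten genes share the same adjusted p-value, 0.4698, even though their raw p-values differ. That pattern is what a Benjamini-Hochberg adjustment produces when a later-ranked gene sets the running minimum. The report does not name the adjustment method, so treating adj.P.Val as Benjamini-Hochberg is an assumption. The Python sketch below uses hypothetical p-values, not values from this report, to show the effect.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

# Hypothetical p-values, not taken from the report.
example = [0.010, 0.020, 0.030, 0.040, 0.500]
adjusted = benjamini_hochberg(example)
print([round(q, 3) for q in adjusted])
```

The four smallest p-values all adjust to 0.05 because each equals p * m / rank = 0.05, so distinct raw p-values collapse to one adjusted value.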

This plot helps identify outlying samples.

This is a heatmap of the given data matrix showing the batch effects and the variation across conditions.

This is a heatmap of the correlation between samples.

This is a circular dendrogram of the given data matrix, colored by batch to show the batch effects.

This is a plot of the top two principal components, colored by batch to show the batch effects.

Principal Component | Proportion of Variance (%) | Cumulative Proportion of Variance (%) | Percent Variation Explained by Either Condition or Batch | Percent Variation Explained by Condition | Condition Significance (p-value) | Percent Variation Explained by Batch | Batch Significance (p-value) |
---|---|---|---|---|---|---|---|
PC1 | 46.33 | 46.33 | 12.6 | 10.1 | 0.1082 | 0.2 | 0.463 |
PC2 | 10.56 | 56.89 | 6.4 | 1.4 | 0.317 | 1.5 | 0.3137 |
PC3 | 9.053 | 65.94 | 8.4 | 0.7 | 0.714 | 7.7 | 0.2124 |
PC4 | 4.48 | 70.42 | 45.3 | 28 | 0.2114 | 40.7 | 0.02073 |
PC5 | 3.771 | 74.19 | 32.8 | 0.6 | 0.1939 | 26.7 | 0.00571 |
PC6 | 3.254 | 77.45 | 17.8 | 16.7 | 0.1797 | 9.9 | 0.6039 |
PC7 | 2.23 | 79.68 | 6.8 | 5.9 | 0.4954 | 4.5 | 0.6632 |
PC8 | 2.035 | 81.71 | 2.3 | 2.1 | 0.6632 | 1.4 | 0.8447 |
PC9 | 1.897 | 83.61 | 0.1 | 0.1 | 0.8981 | 0 | 0.9996 |
PC10 | 1.845 | 85.45 | 6.6 | 0.8 | 0.3508 | 2.3 | 0.2772 |
PC11 | 1.69 | 87.14 | 0.3 | 0.1 | 0.9887 | 0.3 | 0.8558 |
PC12 | 1.534 | 88.68 | 2.1 | 2.1 | 0.5933 | 0.7 | 0.9895 |
PC13 | 1.46 | 90.14 | 10.1 | 5.4 | 0.1547 | 0.3 | 0.3149 |
PC14 | 1.308 | 91.44 | 4.7 | 0.4 | 0.4485 | 1.8 | 0.3588 |
PC15 | 1.297 | 92.74 | 0.1 | 0.1 | 0.9129 | 0 | 0.9998 |
PC16 | 1.232 | 93.97 | 0.3 | 0 | 0.9574 | 0.3 | 0.817 |
PC17 | 1.166 | 95.14 | 7.2 | 3.7 | 0.2356 | 0.2 | 0.4006 |
PC18 | 1.108 | 96.25 | 3.1 | 2.6 | 0.4365 | 0.1 | 0.7397 |
PC19 | 1.064 | 97.31 | 8.1 | 3.3 | 0.2193 | 0.7 | 0.3175 |
PC20 | 0.9734 | 98.28 | 1.8 | 0.3 | 0.6179 | 0.5 | 0.5923 |
PC21 | 0.954 | 99.24 | 21.4 | 13.9 | 0.03049 | 0.1 | 0.1833 |
PC22 | 0.7616 | 100 | 1.8 | 1.6 | 0.5529 | 0.1 | 0.8105 |
PC23 | 4.698e-29 | 100 | 2.3 | 0.5 | 0.5566 | 0.6 | 0.5446 |

This heatmap shows how the mean, variance, skewness, and kurtosis of gene expression vary between samples, grouped by batch, to reveal batch effects.
## Note: The sample-wise p-value is calculated for the variation across samples of a measure computed across genes. The gene-wise p-value is calculated for the variation of each gene between batches, with the measure computed within each batch. If the data are quantile normalized, the sample-wise measure across genes is the same for all samples, and the gene-wise p-value is the better measure.
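
To see why sample-wise measures become uninformative after quantile normalization, the Python sketch below normalizes a small hypothetical matrix (three samples, four genes; the values are illustrative only). It is a minimal implementation that breaks ties by position and is not the routine used by the report.

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of samples (each a list of gene values).

    Ties are broken by position, which is a simplification.
    """
    n_genes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Mean of the values that share each rank across all samples.
    rank_means = [
        sum(s[r] for s in sorted_samples) / len(samples)
        for r in range(n_genes)
    ]
    normalized = []
    for sample in samples:
        order = sorted(range(n_genes), key=lambda g: sample[g])
        result = [0.0] * n_genes
        for rank, gene in enumerate(order):
            result[gene] = rank_means[rank]
        normalized.append(result)
    return normalized

# Hypothetical expression values: three samples, four genes.
samples = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 3.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(samples)
# Every sample now holds the same set of values, in a different gene order.
print([sorted(s) for s in normalized])
```

Because each normalized sample contains exactly the same values, its mean, variance, skewness, and kurtosis across genes are identical by construction, so only the gene-wise comparison between batches can still detect a batch effect.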

This plot shows whether a parametric or a non-parametric prior is appropriate for this data. It also reports the Kolmogorov-Smirnov test comparing the parametric and non-parametric prior distributions.
## Found 2 batches
## Adjusting for 1 covariate(s) or covariate level(s)
## Standardizing Data across genes
## Fitting L/S model and finding priors
## Warning in ks.test(gamma.hat[1, ], "pnorm", gamma.bar[1], sqrt(t2[1])): ties should not be present for the Kolmogorov-Smirnov test
## Warning in ks.test(gamma.hat[1, ], "pnorm", gamma.bar[1], sqrt(shinyInput$t2[1])): ties should not be present for the Kolmogorov-Smirnov test
## Warning in ks.test(delta.hat[1, ], invgam): p-value will be approximate in the presence of ties
## Batch mean distribution across genes: Normal vs Empirical distribution
## Two-sided Kolmogorov-Smirnov test
## Selected Batch: 1
## Statistic D = 0.04411
## p-value = 0
##
##
## Batch Variance distribution across genes: Inverse Gamma vs Empirical distribution
## Two-sided Kolmogorov-Smirnov test
## Selected Batch: 1
## Statistic D = 0.159
## p-value = 0

Note: The non-parametric version of ComBat takes much longer to run, and we recommend it only when the shape of the non-parametric curve differs widely from the parametric one, for example when the distribution is bimodal or highly skewed. Otherwise, the difference in batch adjustment is negligible, and the parametric version is recommended even if the p-value of the KS test above is significant.
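
The statistic D in the Kolmogorov-Smirnov output above is the largest vertical gap between the empirical distribution of the per-gene estimates and the fitted parametric distribution. The Python sketch below computes D against a normal distribution for a few hypothetical values. The function and variable names are illustrative, and it does not reproduce the p-value calculation of R's `ks.test`.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of Normal(mean, sd)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def ks_statistic_normal(values, mean, sd):
    """Two-sided one-sample KS statistic D against Normal(mean, sd)."""
    ordered = sorted(values)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f = normal_cdf(x, mean, sd)
        # The empirical CDF jumps from (i - 1) / n to i / n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical batch-mean estimates, not taken from the report.
gamma_hat = [-1.0, 0.0, 1.0]
d_stat = ks_statistic_normal(gamma_hat, 0.0, 1.0)
print(round(d_stat, 4))
```

When the test runs over thousands of genes, even a small D can yield a p-value near zero, which is why the note above advises judging the shape of the curves and not relying on significance alone.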
## Number of Surrogate Variables found in the given data: 0