Tests for checking Batch Effects
Condition | Batch 150629 | Batch 150722 | Batch 170208 |
---|---|---|---|
Condition crowned | 4 | 2 | 0 |
Condition worker | 0 | 2 | 3 |
Measure | Standardized Pearson Correlation Coefficient | Cramer’s V |
---|---|---|
Confounding Coefficients (0=no confounding, 1=complete confounding) | 0.8806 | 0.7958 |
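Both coefficients can be recomputed from the batch-by-condition counts. The sketch below is written in Python purely for illustration (the report itself comes from R). It assumes the standardized coefficient is Pearson's contingency coefficient divided by its maximum attainable value for a table of this size. The function name `confounding_coefficients` is hypothetical.

```python
import math

def confounding_coefficients(table):
    """Return (standardized Pearson contingency coefficient, Cramer's V)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-square statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(table), len(table[0]))           # smaller table dimension
    cramers_v = math.sqrt(chi2 / (n * (k - 1)))
    pearson_c = math.sqrt(chi2 / (chi2 + n))
    # Assumption: standardize by the maximum of C for a k-level table.
    standardized_c = pearson_c / math.sqrt((k - 1) / k)
    return standardized_c, cramers_v

# Counts from the table above: rows = crowned, worker; columns = batches.
counts = [[4, 2, 0], [0, 2, 3]]
std_c, v = confounding_coefficients(counts)
print(round(std_c, 4), round(v, 4))
```

For these counts the sketch gives values that round to 0.8806 and 0.7958, in agreement with the table.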
Statistic | Full (Condition+Batch) | Condition | Batch |
---|---|---|---|
Min. | 0 | 0 | 0 |
1st Qu. | 24.2 | 1.767 | 19.22 |
Median | 38.28 | 7.787 | 33.41 |
Mean | 39.37 | 12.23 | 35 |
3rd Qu. | 53.38 | 19.41 | 49.11 |
Max. | 96.67 | 78.59 | 96.21 |
Source | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | Ps<0.05 |
---|---|---|---|---|---|---|---|
Batch P-values | 4.26e-05 | 0.1307 | 0.2864 | 0.3481 | 0.5239 | 1 | 0.09862 |
Condition P-values | 0.002146 | 0.4369 | 0.6676 | 0.6284 | 0.846 | 1 | 0.008761 |
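A summary row like the ones above can be derived from a vector of per-gene p-values. The following Python sketch assumes that `Ps<0.05` is the fraction of p-values below 0.05 and that the quartiles use linear interpolation between order statistics. The function name and the example p-values are hypothetical and are not taken from this dataset.

```python
import statistics

def summarize_p_values(p_values, alpha=0.05):
    """Six-number summary plus the fraction of p-values below alpha."""
    # method="inclusive" interpolates linearly between order statistics.
    q1, median, q3 = statistics.quantiles(p_values, n=4, method="inclusive")
    return {
        "Min.": min(p_values),
        "1st Qu.": q1,
        "Median": median,
        "Mean": statistics.fmean(p_values),
        "3rd Qu.": q3,
        "Max.": max(p_values),
        "Ps<0.05": sum(p < alpha for p in p_values) / len(p_values),
    }

# Hypothetical p-values, for illustration only.
example = [0.001, 0.04, 0.2, 0.5, 1.0]
summary = summarize_p_values(example)
print(summary)
```

Read this way, the table would mean that roughly 9.9% of genes have a batch p-value below 0.05, against roughly 0.9% for condition.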
Boxplots of all values for each sample, colored by batch membership.
Gene | Condition: worker (logFC) | AveExpr | t | P.Value | adj.P.Val | B |
---|---|---|---|---|---|---|
XIRP2 | -103 | 42.55 | -3.905 | 0.004956 | 1 | -4.595 |
TIMM9 | -240 | 651.1 | -3.827 | 0.00552 | 1 | -4.595 |
RSRC1 | -110.5 | 388.7 | -3.824 | 0.005543 | 1 | -4.595 |
CCDC28A | -113.5 | 293.5 | -3.615 | 0.007413 | 1 | -4.595 |
HAGHL | -108.5 | 101.7 | -3.605 | 0.007514 | 1 | -4.595 |
NDUFA12 | -117 | 323.7 | -3.602 | 0.007552 | 1 | -4.595 |
RHOBTB2 | -80.5 | 276.5 | -3.421 | 0.009754 | 1 | -4.595 |
ISPD | -70 | 154.8 | -3.377 | 0.01039 | 1 | -4.595 |
VPS51 | -227.5 | 1884 | -3.375 | 0.01043 | 1 | -4.595 |
FAM189A1 | -168.5 | 184.7 | -3.303 | 0.01157 | 1 | -4.595 |
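The `adj.P.Val` column is 1 for every gene listed even though the raw p-values are below 0.012. The Python sketch below shows how a multiple-testing adjustment can produce this. It assumes the Benjamini-Hochberg procedure, which is the default in limma's `topTable`; whether this report kept that default is not stated. The function name `bh_adjust` is hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical p-values, for illustration only.
adjusted = bh_adjust([0.005, 0.5, 0.9])
print(adjusted)
```

Under this adjustment, a raw p-value near 0.005 can reach an adjusted value of 1 once a few hundred or more genes are tested, which is consistent with the column above.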
This plot helps identify outlying samples.
This is a heatmap of the given data matrix, showing batch effects and variation across conditions.
This is a heatmap of the correlation between samples.
This is a circular dendrogram of the given data matrix, colored by batch to show the batch effects.
This is a plot of the top two principal components colored by batch to show the batch effects.
Principal Component | Proportion of Variance (%) | Cumulative Proportion of Variance (%) | Percent Variation Explained by Either Condition or Batch | Percent Variation Explained by Condition | Condition Significance (p-value) | Percent Variation Explained by Batch | Batch Significance (p-value) |
---|---|---|---|---|---|---|---|
PC1 | 32.85 | 32.85 | 45.6 | 4.6 | 0.9173 | 45.5 | 0.1399 |
PC2 | 26.29 | 59.14 | 56 | 24.2 | 0.5767 | 53.9 | 0.1489 |
PC3 | 13.14 | 72.28 | 29.6 | 22.8 | 0.4331 | 22.7 | 0.722 |
PC4 | 6.055 | 78.33 | 10.8 | 0.5 | 0.8333 | 10.2 | 0.6817 |
PC5 | 4.794 | 83.13 | 10.2 | 0.9 | 0.8087 | 9.4 | 0.7064 |
PC6 | 4.42 | 87.55 | 34.5 | 3.3 | 0.1131 | 3.8 | 0.2557 |
PC7 | 4.091 | 91.64 | 4.8 | 0.3 | 0.8693 | 4.4 | 0.849 |
PC8 | 3.185 | 94.82 | 27.1 | 9.3 | 0.4941 | 21.7 | 0.4645 |
PC9 | 2.813 | 97.64 | 27.3 | 7.1 | 0.9054 | 27.2 | 0.4239 |
PC10 | 2.363 | 100 | 53.9 | 27.1 | 0.02538 | 1.2 | 0.2008 |
PC11 | 1.214e-28 | 100 | 23.6 | 1.1 | 0.3867 | 14.3 | 0.4047 |
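Two columns of this table are easy to illustrate. The cumulative column is a running sum of the per-component proportions. The "percent variation explained" columns are sketched here as the R-squared of a one-way model of a component's scores on a factor, which is an assumption about how the report computes them. The function names and the example scores are hypothetical.

```python
def cumulative_percent(proportions):
    """Running sum of per-component variance proportions (in %)."""
    total, out = 0.0, []
    for p in proportions:
        total += p
        out.append(round(total, 2))
    return out

def percent_explained_by_factor(scores, labels):
    """R-squared (in %) of a one-way model: scores ~ labels."""
    grand_mean = sum(scores) / len(scores)
    ss_total = sum((s - grand_mean) ** 2 for s in scores)
    groups = {}
    for s, lab in zip(scores, labels):
        groups.setdefault(lab, []).append(s)
    # Between-group sum of squares: group size times squared mean offset.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return 100.0 * ss_between / ss_total

# First three proportions from the table above.
cumulative = cumulative_percent([32.85, 26.29, 13.14])
print(cumulative)

# Hypothetical PC scores and batch labels, for illustration only.
r2 = percent_explained_by_factor([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"])
print(r2)
```

The running sums reproduce the 32.85, 59.14 and 72.28 entries of the cumulative column. Note also that the design has 11 samples, so the centered data can have at most 10 components with nonzero variance, which is consistent with PC11 being numerically zero.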
This is a heatmap showing how the mean, variance, skewness and kurtosis of gene expression vary between samples, grouped by batch to reveal batch effects.
## Note: The sample-wise p-value is calculated for the variation across samples of the measure computed across genes. The gene-wise p-value is calculated for the variation of each gene between batches of the measure computed within each batch. If the data is quantile normalized, then the sample-wise measure across genes is the same for all samples, and the gene-wise p-value is a good measure.
This is a plot showing whether a parametric or non-parametric prior is appropriate for this data. It also shows the Kolmogorov-Smirnov test comparing the parametric and non-parametric prior distributions.
## Found 3 batches
## Adjusting for 1 covariate(s) or covariate level(s)
## Standardizing Data across genes
## Fitting L/S model and finding priors
## Warning in ks.test(gamma.hat[1, ], "pnorm", gamma.bar[1], sqrt(t2[1])): ties should not be present for the Kolmogorov-Smirnov test
## Warning in ks.test(gamma.hat[1, ], "pnorm", gamma.bar[1], sqrt(shinyInput$t2[1])): ties should not be present for the Kolmogorov-Smirnov test
## Warning in ks.test(delta.hat[1, ], invgam): p-value will be approximate in the presence of ties
## Batch mean distribution across genes: Normal vs Empirical distribution
## Two-sided Kolmogorov-Smirnov test
## Selected Batch: 1
## Statistic D = 0.01989
## p-value = 9.227e-06
##
##
## Batch Variance distribution across genes: Inverse Gamma vs Empirical distribution
## Two-sided Kolmogorov-Smirnov test
## Selected Batch: 1
## Statistic D = 0.1364
## p-value = 0
Note: The non-parametric version of ComBat takes much longer to run, and we recommend it only when the shape of the non-parametric curve differs widely from the parametric one, such as a bimodal or highly skewed distribution. Otherwise, the difference in batch adjustment is negligible, and the parametric version is recommended even if the p-value of the KS test above is significant.
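The `Statistic D` values reported above are one-sample Kolmogorov-Smirnov distances between an empirical distribution and a reference distribution. The Python sketch below computes that distance against a normal reference, as a minimal illustration of the quantity rather than the code behind this report. The function names are hypothetical.

```python
import math

def normal_cdf(x, mean, sd):
    """CDF of a normal distribution with the given mean and sd."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def ks_statistic_normal(sample, mean, sd):
    """One-sample KS distance D between the sample and Normal(mean, sd)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = normal_cdf(x, mean, sd)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# A single observation at the mean sits at distance 0.5 from the reference.
d_single = ks_statistic_normal([0.0], 0.0, 1.0)
print(d_single)
```

Because the test's p-value shrinks as the number of observations grows, a small distance such as D = 0.01989 can still be highly significant when many genes are involved. That fits the note above, which recommends judging the shape of the curves rather than relying on the p-value alone.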
## Number of Surrogate Variables found in the given data: 2