The aim of the experiment is to determine patch initiation rates in Ede1 internal domain deletion mutants. We looked at the patch density and lifetimes of the late coat protein Sla1 in Ede1 mutants lacking all or part of the central region. This notebook is devoted to the Sla1 patch density.
I acquired all data on the Olympus IX81 equipped with a 100x/1.49 objective, using the X-Cite 120PC lamp at 100% intensity and 400 ms exposure for illumination. Light was filtered through a U-MGFPHQ filter cube. I acquired stacks of 26 planes with a step size of 0.2 microns.
Individual non-budding cells were cropped from fields of view. Patch numbers were extracted using the Python function `count_patches` from my personal package `mkimage`, containing a set of wrappers for scikit-image functions. Briefly, the images were median-filtered with a 5 px disk brush, and the filtered images were subtracted from the originals to subtract local background. The background-subtracted images were thresholded using the Yen method. The thresholded images were eroded using the number of non-zero neighbouring pixels in 3D as the erosion criterion. The spots were counted using the `skimage.measure.label()` function with 2-connectivity.
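The steps above can be sketched with SciPy and scikit-image roughly as follows. This is not the `count_patches` function from `mkimage`, whose source is not shown in this notebook. The name `count_patches_sketch`, the plane-by-plane median filter, and the reading of `erosion_n` as "minimum number of non-zero neighbours a voxel needs to survive" are all assumptions made for illustration.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import threshold_yen
from skimage.measure import label
from skimage.morphology import disk


def count_patches_sketch(stack, median_radius=5, erosion_n=1, connectivity=2):
    """Count bright spots in a 3D stack ordered (z, y, x).

    Hypothetical re-implementation of the described steps, not the
    original mkimage code.
    """
    stack = np.asarray(stack, dtype=float)

    # Median-filter each plane with a disk footprint (assumption: the
    # filter is applied in 2D, plane by plane) and subtract the result
    # to remove local background.
    footprint = disk(median_radius)[np.newaxis, :, :]
    background = ndi.median_filter(stack, footprint=footprint)
    subtracted = np.clip(stack - background, 0, None)

    # Global Yen threshold on the background-subtracted stack.
    mask = subtracted > threshold_yen(subtracted)

    # Count non-zero neighbours in the 3D 26-neighbourhood and drop
    # voxels with fewer than erosion_n of them (assumed criterion).
    kernel = np.ones((3, 3, 3), dtype=int)
    kernel[1, 1, 1] = 0
    neighbours = ndi.convolve(mask.astype(int), kernel, mode="constant", cval=0)
    eroded = mask & (neighbours >= erosion_n)

    # Label connected components; connectivity=2 joins voxels that
    # share a face or an edge.
    labels, n_patches = label(eroded, connectivity=connectivity, return_num=True)
    return n_patches, labels
```

With these assumptions, an isolated above-threshold voxel has no non-zero neighbours and is removed before counting, while compact multi-voxel spots are kept.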
Cross-section area was obtained by median-filtering the stack with a 10 px disk brush, calculating the maximum projection image, thresholding using Otsu’s algorithm, and using `skimage.measure.regionprops()` to measure area. Note that these areas are pixel counts. To determine the total surface area, I assumed that an unbudded cell is spherical (surface area is four times the cross-section area).
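The spherical assumption follows from \(4 \pi r^2 = 4 \cdot \pi r^2\): the surface of a sphere is four times its largest cross-section. A minimal sketch of the conversion from a pixel count to a surface area is below. The function names and the `pixel_size_um` parameter are hypothetical; the pixel size of the camera is not stated in this notebook.

```python
import math


def sphere_surface_area_um2(cross_section_px, pixel_size_um):
    """Surface area of a sphere from the pixel count of its largest cross-section.

    pixel_size_um is the edge length of one pixel in microns (hypothetical
    parameter; the real value depends on the camera and magnification).
    """
    cross_section_um2 = cross_section_px * pixel_size_um ** 2
    # Sphere: 4 * pi * r^2 = 4 * (pi * r^2) = 4 * cross-section area.
    return 4.0 * cross_section_um2


def equivalent_radius_um(cross_section_px, pixel_size_um):
    """Radius of the circle with the same area as the measured cross-section."""
    return math.sqrt(cross_section_px * pixel_size_um ** 2 / math.pi)
```

For example, with an illustrative pixel size of 0.1 microns, a cross-section of about 1257 pixels corresponds to a radius of 2 microns and a surface area of about 50 square microns.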
This call to `site_counter` was used to process all datasets:
    process_folder(path, median_radius=5, erosion_n=1, con=2,
                   method=Yen, mask=False, loop=False, save_images=True)
`data_cleanup.Rmd` was used to gather all output into tidy data frames with no further modifications.
strain | ede1 |
---|---|
MKY0140 | wt |
MKY3770 | pq |
MKY3776 | cc |
MKY3782 | pqcc |
MKY0654 | delta |
Despite my best intentions, the illumination settings differed between datasets. Datasets #1 and #2 were both acquired at 100% lamp power, but #1 had a 50% ND filter inserted in the light path and #2 did not, which I did not realise at the time.
As a result, the overall signal in dataset #2 was brighter, but there was more bleaching in the later planes of the stack. This might have resulted in undercounting of some patches at the bottom.
With datasets #3 and #4 I was careful not to repeat the same mistake and acquired everything at 50% lamp power.
Therefore datasets #1, #3 and #4 can be regarded as ‘canonical’ independent repeats. However, #2 does not appear to be an outlier, so I included it in the final analysis.
ede1 | dataset | n | patches_mean | patches_sd | patches_se | area_mean | area_sd | area_se |
---|---|---|---|---|---|---|---|---|
wt | 1 | 33 | 30.72727 | 7.702567 | 1.3408449 | 57.01005 | 7.899497 | 1.3751260 |
wt | 2 | 52 | 25.28846 | 5.496297 | 0.7621992 | 53.95972 | 13.422453 | 1.8613593 |
wt | 3 | 63 | 29.74603 | 8.090154 | 1.0192636 | 49.81814 | 6.591686 | 0.8304744 |
wt | 4 | 56 | 29.30357 | 6.717466 | 0.8976592 | 53.99737 | 8.254882 | 1.1031050 |
pq | 1 | 49 | 27.34694 | 7.512631 | 1.0732330 | 64.07668 | 9.498606 | 1.3569437 |
pq | 2 | 61 | 23.54098 | 5.687922 | 0.7282638 | 64.85298 | 11.574001 | 1.4818990 |
pq | 3 | 56 | 24.85714 | 7.590493 | 1.0143223 | 62.52588 | 11.025026 | 1.4732810 |
pq | 4 | 56 | 25.08929 | 7.012396 | 0.9370709 | 59.07555 | 9.917223 | 1.3252447 |
cc | 1 | 47 | 22.91489 | 5.356141 | 0.7812735 | 58.75194 | 10.444307 | 1.5234588 |
cc | 2 | 52 | 20.01923 | 5.859433 | 0.8125572 | 56.04337 | 11.722141 | 1.6255685 |
cc | 3 | 56 | 20.98214 | 6.397417 | 0.8548908 | 50.73425 | 10.279776 | 1.3736928 |
cc | 4 | 56 | 21.98214 | 5.937526 | 0.7934354 | 52.72582 | 6.696932 | 0.8949152 |
pqcc | 1 | 56 | 16.08929 | 6.162280 | 0.8234694 | 57.45513 | 8.006104 | 1.0698607 |
pqcc | 2 | 62 | 16.80645 | 5.337276 | 0.6778347 | 59.10695 | 11.661839 | 1.4810550 |
pqcc | 3 | 54 | 16.03704 | 6.933547 | 0.9435362 | 52.34180 | 10.339465 | 1.4070229 |
pqcc | 4 | 49 | 19.30612 | 5.598378 | 0.7997683 | 54.37294 | 9.698409 | 1.3854870 |
delta | 1 | 44 | 14.63636 | 6.324890 | 0.9535130 | 58.88947 | 10.932393 | 1.6481203 |
delta | 2 | 45 | 15.46667 | 5.562047 | 0.8291410 | 56.58791 | 12.111897 | 1.8055350 |
delta | 3 | 52 | 15.09615 | 6.114142 | 0.8478789 | 50.81073 | 8.869104 | 1.2299234 |
delta | 4 | 52 | 17.96154 | 6.293356 | 0.8727315 | 57.26936 | 11.048515 | 1.5321534 |
We can combine the patch number and area into \(\mathrm{density} = \mathrm{patches} / \mathrm{area}\), calculated individually for each cell. We can summarise the data for each Ede1 mutant in each dataset:
ede1 | dataset | n | density_mean | density_sd | density_se | density_median | density_mad |
---|---|---|---|---|---|---|---|
wt | 1 | 33 | 0.5382331 | 0.1249763 | 0.0217556 | 0.5509918 | 0.1424483 |
wt | 2 | 52 | 0.4853156 | 0.1166988 | 0.0161832 | 0.4963251 | 0.1036285 |
wt | 3 | 63 | 0.5931019 | 0.1156752 | 0.0145737 | 0.5885655 | 0.0961262 |
wt | 4 | 56 | 0.5488695 | 0.1227789 | 0.0164070 | 0.5656413 | 0.1345622 |
pq | 1 | 49 | 0.4303778 | 0.1108477 | 0.0158354 | 0.4152123 | 0.0870553 |
pq | 2 | 61 | 0.3707952 | 0.0950338 | 0.0121678 | 0.3782260 | 0.1032522 |
pq | 3 | 56 | 0.4020377 | 0.1236939 | 0.0165293 | 0.3800441 | 0.1032181 |
pq | 4 | 56 | 0.4306633 | 0.1226327 | 0.0163875 | 0.4399799 | 0.0801368 |
cc | 1 | 47 | 0.3942192 | 0.0851579 | 0.0124216 | 0.4025688 | 0.0699532 |
cc | 2 | 52 | 0.3679239 | 0.1097261 | 0.0152163 | 0.3763154 | 0.1089664 |
cc | 3 | 56 | 0.4169928 | 0.1201964 | 0.0160619 | 0.4289882 | 0.1041989 |
cc | 4 | 56 | 0.4174882 | 0.1025519 | 0.0137041 | 0.4003336 | 0.0996016 |
pqcc | 1 | 56 | 0.2796635 | 0.0951736 | 0.0127181 | 0.2875115 | 0.0965174 |
pqcc | 2 | 62 | 0.2906701 | 0.0906537 | 0.0115130 | 0.3039545 | 0.0874739 |
pqcc | 3 | 54 | 0.3030404 | 0.1107896 | 0.0150766 | 0.2889415 | 0.1137708 |
pqcc | 4 | 49 | 0.3595033 | 0.0960130 | 0.0137161 | 0.3714628 | 0.1000371 |
delta | 1 | 44 | 0.2482628 | 0.0983372 | 0.0148249 | 0.2557438 | 0.1065543 |
delta | 2 | 45 | 0.2789393 | 0.0996907 | 0.0148610 | 0.2756539 | 0.0931660 |
delta | 3 | 52 | 0.2934675 | 0.1064340 | 0.0147597 | 0.3275947 | 0.0893750 |
delta | 4 | 52 | 0.3199552 | 0.1152373 | 0.0159805 | 0.3101383 | 0.0986257 |
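The per-cell density and the grouped summary above can be sketched in plain Python as follows. The record layout (dicts with `ede1`, `dataset`, `patches`, `area` keys) and the function name are hypothetical. I also assume that `density_mad` is the scaled median absolute deviation (constant 1.4826, the default of R's `mad()`), which is plausible because the tabulated values are close to the standard deviations.

```python
import math
import statistics
from collections import defaultdict

# Consistency constant for normally distributed data; assumed to match
# the default used by R's mad().
MAD_SCALE = 1.4826


def summarise_density(cells):
    """Summarise per-cell density for each (ede1, dataset) group.

    cells: iterable of dicts with keys 'ede1', 'dataset', 'patches', 'area'
    (hypothetical layout). Each group needs at least two cells.
    """
    groups = defaultdict(list)
    for cell in cells:
        # Density is computed cell by cell, not from the group means.
        groups[(cell["ede1"], cell["dataset"])].append(cell["patches"] / cell["area"])

    summary = {}
    for key, dens in groups.items():
        n = len(dens)
        sd = statistics.stdev(dens)
        med = statistics.median(dens)
        mad = MAD_SCALE * statistics.median(abs(d - med) for d in dens)
        summary[key] = {
            "n": n,
            "density_mean": statistics.fmean(dens),
            "density_sd": sd,
            "density_se": sd / math.sqrt(n),
            "density_median": med,
            "density_mad": mad,
        }
    return summary
```

Because the ratio is taken per cell, the group mean density is generally not equal to the ratio of `patches_mean` to `area_mean` from the previous table.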
I have chosen to show these data using the SuperPlot style. Each point shows the density of Sla1-EGFP patches in an individual cell.
Large coloured points show the mean measurements from the four independent repeats.
The range is mean +/- SD, calculated from the four independent repeat means.
Let’s add significance stars based on Tukey’s test.
This is only a subset of comparisons and it’s already cluttered. The alternative is…
In this view, groups sharing at least one letter are not significantly different at the chosen significance level (here, \(\alpha = 0.05\)).
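To make the letter logic concrete, here is a small sketch that derives such letters from the adjusted p-values reported in the Tukey table of this notebook. It is a brute-force approach (one letter per maximal set of mutually non-significant groups) that is only practical for a handful of groups. It is not the algorithm used by `multcompView`, and the function name is hypothetical.

```python
from itertools import combinations

GROUPS = ["wt", "pq", "cc", "pqcc", "delta"]

# Adjusted p-values copied from the Tukey table in this notebook.
TUKEY_P_ADJ = {
    frozenset(("wt", "pq")): 3.61e-04,
    frozenset(("wt", "cc")): 1.77e-04,
    frozenset(("wt", "pqcc")): 5.00e-07,
    frozenset(("wt", "delta")): 1.00e-07,
    frozenset(("pq", "cc")): 9.94e-01,
    frozenset(("pq", "pqcc")): 5.02e-03,
    frozenset(("pq", "delta")): 7.69e-04,
    frozenset(("cc", "pqcc")): 1.09e-02,
    frozenset(("cc", "delta")): 1.62e-03,
    frozenset(("pqcc", "delta")): 8.58e-01,
}


def compact_letters(groups, p_adj, alpha=0.05):
    """Assign letters so that two groups share a letter iff they are
    not significantly different at level alpha."""

    def not_significant(a, b):
        return p_adj[frozenset((a, b))] >= alpha

    # Collect maximal sets of mutually non-significant groups, largest first,
    # so that any subset of an already accepted set is skipped.
    cliques = []
    for size in range(len(groups), 0, -1):
        for subset in combinations(groups, size):
            if all(not_significant(a, b) for a, b in combinations(subset, 2)):
                members = set(subset)
                if not any(members <= c for c in cliques):
                    cliques.append(members)

    # Order the sets by their first member so letters follow the group order.
    cliques.sort(key=lambda c: min(groups.index(g) for g in c))

    letters = {g: "" for g in groups}
    for letter, clique in zip("abcdefghijklmnopqrstuvwxyz", cliques):
        for g in clique:
            letters[g] += letter
    return letters


print(compact_letters(GROUPS, TUKEY_P_ADJ))
```

For these p-values the result is one letter for wt, a shared letter for pq and cc, and a shared letter for pqcc and delta.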
Pros:
Cons:
ANOVA and similar parametric tests assume that the errors are normally distributed, with homogeneous variances, and that the samples are independent.
We will test the null hypothesis that mean Sla1 density is the same across different Ede1 strains. We will use repeat-level data for the tests to account for experimental variability.
From the plots it looks like the underlying data is ‘normal enough’, considering that ANOVA can tolerate some departure from normality. We can check the normality of residuals used in the model later, but it might still be interesting to know how normal the underlying data is overall.
If we do a formal test (Shapiro-Wilk):
ede1 | n | shapiro.p |
---|---|---|
wt | 204 | 0.7403509 |
pq | 222 | 0.4077729 |
cc | 211 | 0.1588376 |
pqcc | 221 | 0.6747696 |
delta | 193 | 0.0978718 |
Q-Q plots:
The data looks quite normal.
Four points per group are probably enough to assess whether the variance is similar in the repeat-level data. Levene’s test:
df1 | df2 | statistic | p |
---|---|---|---|
4 | 15 | 0.1598623 | 0.9553962 |
Levene’s test does not reject the null hypothesis that the variance is equal across groups (\(p = 0.96\)).
Given the null hypothesis of equal means, what is the probability of obtaining results at least as extreme as these?
term | df | sumsq | meansq | statistic | p.value |
---|---|---|---|---|---|
ede1 | 4 | 0.1640381 | 0.0410095 | 37.4362 | 1e-07 |
Residuals | 15 | 0.0164318 | 0.0010955 | NA | NA |
One-way ANOVA rejects the null (\(F_{4,15} = 37.4\), \(p \approx 10^{-7}\)).
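As a cross-check, the sums of squares and the F statistic in the ANOVA table can be recomputed by hand from the 20 repeat-level means listed in the per-dataset density table. The sketch below uses only the Python standard library, so it stops at the F statistic and does not compute the p-value. The same helper also gives the Levene statistic if it is applied to absolute deviations from the group medians; the choice of median centring is my assumption, made because it approximately reproduces the value reported above.

```python
import statistics

# Repeat-level mean densities copied from the per-dataset summary table.
REPEAT_MEANS = {
    "wt": [0.5382331, 0.4853156, 0.5931019, 0.5488695],
    "pq": [0.4303778, 0.3707952, 0.4020377, 0.4306633],
    "cc": [0.3942192, 0.3679239, 0.4169928, 0.4174882],
    "pqcc": [0.2796635, 0.2906701, 0.3030404, 0.3595033],
    "delta": [0.2482628, 0.2789393, 0.2934675, 0.3199552],
}


def one_way_anova(groups):
    """Degrees of freedom, sums of squares and F for a dict of value lists."""
    values = [v for g in groups.values() for v in g]
    grand_mean = statistics.fmean(values)
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups.values()
    )
    ss_within = sum(
        (v - statistics.fmean(g)) ** 2 for g in groups.values() for v in g
    )
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "df": (df_between, df_within),
        "sumsq": (ss_between, ss_within),
        "statistic": f_stat,
    }


def levene_median(groups):
    """Levene's statistic: one-way ANOVA on absolute deviations from the
    group medians (median centring is an assumption, see above)."""
    abs_dev = {
        name: [abs(v - statistics.median(g)) for v in g]
        for name, g in groups.items()
    }
    return one_way_anova(abs_dev)


anova = one_way_anova(REPEAT_MEANS)
levene = levene_median(REPEAT_MEANS)
print(anova["df"], anova["sumsq"], round(anova["statistic"], 2))
print(levene["df"], round(levene["statistic"], 3))
```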
The residuals look approximately normally distributed (histogram, Q-Q plot) with similar variance (Residuals vs. Fitted, grouped by factor).
Following the rejection of the null by ANOVA, we can use Tukey-Kramer to check pairwise comparisons.
term | group1 | group2 | null.value | estimate | conf.low | conf.high | p.adj | p.adj.signif |
---|---|---|---|---|---|---|---|---|
ede1 | wt | pq | 0 | -0.1329115 | -0.2051799 | -0.0606432 | 3.61e-04 | *** |
ede1 | wt | cc | 0 | -0.1422240 | -0.2144923 | -0.0699556 | 1.77e-04 | *** |
ede1 | wt | pqcc | 0 | -0.2331607 | -0.3054290 | -0.1608923 | 5.00e-07 | **** |
ede1 | wt | delta | 0 | -0.2562238 | -0.3284922 | -0.1839555 | 1.00e-07 | **** |
ede1 | pq | cc | 0 | -0.0093125 | -0.0815808 | 0.0629559 | 9.94e-01 | ns |
ede1 | pq | pqcc | 0 | -0.1002492 | -0.1725175 | -0.0279808 | 5.02e-03 | ** |
ede1 | pq | delta | 0 | -0.1233123 | -0.1955807 | -0.0510440 | 7.69e-04 | *** |
ede1 | cc | pqcc | 0 | -0.0909367 | -0.1632050 | -0.0186683 | 1.09e-02 | * |
ede1 | cc | delta | 0 | -0.1139998 | -0.1862682 | -0.0417315 | 1.62e-03 | ** |
ede1 | pqcc | delta | 0 | -0.0230631 | -0.0953315 | 0.0492052 | 8.58e-01 | ns |
Two comparisons do not produce statistically significant differences: pq vs. cc (\(p = 0.994\)) and pqcc vs. delta (\(p = 0.858\)).
For all other pairs, \(p < 0.05\) (at least); for all comparisons with wild type \(p < 0.001\).
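The confidence limits in the Tukey table can be approximately reproduced from the residual mean square of the ANOVA. With equal group sizes, every pairwise interval has the same half-width, \(q \sqrt{MS_{res} / n}\). The sketch below takes the critical value of the studentized range as an input. The value 4.367 for \(\alpha = 0.05\), 5 groups and 15 residual degrees of freedom is taken from standard tables and is not computed here. Group means are the rounded values from the final estimates table, and the function name is hypothetical.

```python
import math
from itertools import combinations

# Rounded group means from the final estimates table.
GROUP_MEANS = {"wt": 0.541, "pq": 0.408, "cc": 0.399, "pqcc": 0.308, "delta": 0.285}

MS_RESIDUAL = 0.0010955  # residual mean square from the ANOVA table
N_PER_GROUP = 4          # independent repeats per strain
Q_CRIT = 4.367           # assumed: tabulated studentized range quantile (0.05; 5, 15)


def tukey_hsd(means, ms_residual, n_per_group, q_crit):
    """Return the common CI half-width and one row per pairwise comparison."""
    half_width = q_crit * math.sqrt(ms_residual / n_per_group)
    rows = []
    for a, b in combinations(means, 2):
        estimate = means[b] - means[a]  # same sign convention as the table
        rows.append({
            "group1": a,
            "group2": b,
            "estimate": estimate,
            "conf.low": estimate - half_width,
            "conf.high": estimate + half_width,
            "significant": abs(estimate) > half_width,
        })
    return half_width, rows


half_width, rows = tukey_hsd(GROUP_MEANS, MS_RESIDUAL, N_PER_GROUP, Q_CRIT)
not_significant = [(r["group1"], r["group2"]) for r in rows if not r["significant"]]
print(round(half_width, 4), not_significant)
```

A pair is significant at the 5% level exactly when its interval excludes zero, that is, when the absolute difference of means exceeds the half-width.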
Summary statistics for all experiments, derived from mean values of N independent repeats.
Final estimates with lower / upper 95% confidence intervals and a comparison to wild type (in %). `half_ci` is the half-width of the confidence interval, used for writing CI ranges in the format mean +/- error.
ede1 | mean | lower | upper | proc_wt | half_ci |
---|---|---|---|---|---|
wt | 0.541 | 0.471 | 0.612 | 100 | 0.070 |
pq | 0.408 | 0.363 | 0.454 | 75 | 0.045 |
cc | 0.399 | 0.362 | 0.437 | 74 | 0.037 |
pqcc | 0.308 | 0.252 | 0.365 | 57 | 0.056 |
delta | 0.285 | 0.238 | 0.333 | 53 | 0.048 |
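The columns of this table can be recomputed from the four repeat-level means per strain. The sketch below assumes a t-based 95% confidence interval with 3 degrees of freedom (critical value 3.182446), which is consistent with the tabulated values. The function name is hypothetical.

```python
import math
import statistics

# Repeat-level mean densities copied from the per-dataset summary table.
REPEAT_MEANS = {
    "wt": [0.5382331, 0.4853156, 0.5931019, 0.5488695],
    "pq": [0.4303778, 0.3707952, 0.4020377, 0.4306633],
    "cc": [0.3942192, 0.3679239, 0.4169928, 0.4174882],
    "pqcc": [0.2796635, 0.2906701, 0.3030404, 0.3595033],
    "delta": [0.2482628, 0.2789393, 0.2934675, 0.3199552],
}

# Two-sided 95% critical value of Student's t for 3 degrees of freedom.
T_CRIT_DF3 = 3.182446


def final_estimate(repeat_means, wt_mean, t_crit=T_CRIT_DF3):
    """Mean, 95% CI limits, percent of wild type and CI half-width.

    t_crit must match len(repeat_means) - 1 degrees of freedom.
    """
    n = len(repeat_means)
    mean = statistics.fmean(repeat_means)
    se = statistics.stdev(repeat_means) / math.sqrt(n)
    half_ci = t_crit * se
    return {
        "mean": mean,
        "lower": mean - half_ci,
        "upper": mean + half_ci,
        "proc_wt": 100 * mean / wt_mean,
        "half_ci": half_ci,
    }


wt_mean = statistics.fmean(REPEAT_MEANS["wt"])
for strain, values in REPEAT_MEANS.items():
    est = final_estimate(values, wt_mean)
    print(strain, {k: round(v, 3) for k, v in est.items()})
```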
ede1 | N | mean | sd | se | median | mad |
---|---|---|---|---|---|---|
wt | 4 | 0.541 | 0.044 | 0.022 | 0.544 | 0.041 |
pq | 4 | 0.408 | 0.028 | 0.014 | 0.416 | 0.021 |
cc | 4 | 0.399 | 0.023 | 0.012 | 0.406 | 0.017 |
pqcc | 4 | 0.308 | 0.035 | 0.018 | 0.297 | 0.017 |
delta | 4 | 0.285 | 0.030 | 0.015 | 0.286 | 0.030 |
So far we have mostly looked at the statistics derived from experiment-level means. For completeness, the table below reports, for each group, the number of observations and the density statistics derived from pooled observations from all repeats:
ede1 | n | mean | sd | se | median | mad | 25% | 75% |
---|---|---|---|---|---|---|---|---|
wt | 204 | 0.545 | 0.125 | 0.009 | 0.551 | 0.132 | 0.459 | 0.633 |
pq | 222 | 0.407 | 0.115 | 0.008 | 0.408 | 0.104 | 0.328 | 0.473 |
cc | 211 | 0.400 | 0.107 | 0.007 | 0.399 | 0.099 | 0.332 | 0.465 |
pqcc | 221 | 0.306 | 0.102 | 0.007 | 0.308 | 0.100 | 0.240 | 0.375 |
delta | 193 | 0.287 | 0.108 | 0.008 | 0.289 | 0.107 | 0.218 | 0.361 |
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Mojave 10.14.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] multcompView_0.1-8 knitr_1.33 rstatix_0.7.0 broom_0.7.6
## [5] ggsignif_0.6.1 ggbeeswarm_0.6.0 forcats_0.5.1 stringr_1.4.0
## [9] dplyr_1.0.6 purrr_0.3.4 readr_1.4.0 tidyr_1.1.3
## [13] tibble_3.1.2 ggplot2_3.3.3 tidyverse_1.3.1
##
## loaded via a namespace (and not attached):
## [1] fs_1.5.0 lubridate_1.7.10 RColorBrewer_1.1-2
## [4] httr_1.4.2 tools_4.1.0 backports_1.2.1
## [7] utf8_1.2.1 R6_2.5.0 rpart_4.1-15
## [10] vipor_0.4.5 Hmisc_4.5-0 DBI_1.1.1
## [13] colorspace_2.0-1 nnet_7.3-16 withr_2.4.2
## [16] gridExtra_2.3 tidyselect_1.1.1 curl_4.3.1
## [19] compiler_4.1.0 cli_2.5.0 rvest_1.0.0
## [22] htmlTable_2.2.1 xml2_1.3.2 labeling_0.4.2
## [25] checkmate_2.0.0 scales_1.1.1 digest_0.6.27
## [28] foreign_0.8-81 rmarkdown_2.8 rio_0.5.26
## [31] base64enc_0.1-3 jpeg_0.1-8.1 pkgconfig_2.0.3
## [34] htmltools_0.5.1.1 dbplyr_2.1.1 highr_0.9
## [37] htmlwidgets_1.5.3 rlang_0.4.11 readxl_1.3.1
## [40] rstudioapi_0.13 farver_2.1.0 generics_0.1.0
## [43] jsonlite_1.7.2 zip_2.2.0 car_3.0-10
## [46] magrittr_2.0.1 Formula_1.2-4 Matrix_1.3-3
## [49] Rcpp_1.0.6 munsell_0.5.0 fansi_0.5.0
## [52] abind_1.4-5 lifecycle_1.0.0 stringi_1.6.2
## [55] yaml_2.2.1 carData_3.0-4 grid_4.1.0
## [58] crayon_1.4.1 lattice_0.20-44 haven_2.4.1
## [61] splines_4.1.0 hms_1.1.0 pillar_1.6.1
## [64] reprex_2.0.0 glue_1.4.2 evaluate_0.14
## [67] latticeExtra_0.6-29 data.table_1.14.0 modelr_0.1.8
## [70] vctrs_0.3.8 png_0.1-7 cellranger_1.1.0
## [73] gtable_0.3.0 assertthat_0.2.1 xfun_0.23
## [76] openxlsx_4.2.3 mime_0.10 survival_3.2-11
## [79] beeswarm_0.3.1 cluster_2.1.2 ellipsis_0.3.2