Experiment

Rationale

The aim of the experiment is to determine patch initiation rates in Ede1 internal domain deletion mutants. We looked at the patch density and lifetimes of late coat protein Sla1 in Ede1 mutants lacking all or some of the central region. This notebook is devoted to the Sla1 patch density.

Illumination settings

I acquired all data on the Olympus IX81 equipped with a 100x/1.49 objective, using the X-Cite 120PC lamp at 100% intensity and 400 ms exposure for illumination. Light was filtered through a U-MGFPHQ filter cube. I acquired stacks of 26 planes with a step size of 0.2 microns.

Image processing

Individual non-budding cells were cropped from fields of view. Patch numbers were extracted using the Python function count_patches from my personal package mkimage, which contains a set of wrappers for scikit-image functions. Briefly, the images were median-filtered with a 5 px disk brush, and the filtered images were subtracted from the originals to remove local background. The background-subtracted images were thresholded using the Yen method. The thresholded images were eroded using the number of non-zero neighbouring pixels in 3D as the erosion criterion. The spots were counted using the skimage.measure.label() function with 2-connectivity.
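The steps above can be sketched roughly as follows. This is not the mkimage code: the function name count_patches_sketch, the plane-by-plane 2D median filter, and the exact erosion rule (keep a voxel if at least erosion_n of its 26 neighbours are foreground) are my assumptions about how the description could be implemented. The parameter names mirror the process_folder call only for readability.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import threshold_yen
from skimage.measure import label
from skimage.morphology import disk


def count_patches_sketch(stack, median_radius=5, erosion_n=1, con=2,
                         method=threshold_yen):
    """Count bright spots in a 3D stack shaped (planes, rows, columns).

    Hypothetical re-implementation of the described steps, not mkimage itself.
    """
    stack = np.asarray(stack, dtype=float)  # float avoids unsigned wrap-around

    # Local background: 2D median filter with a disk footprint, plane by plane
    # (assumption: the disk brush is applied within each plane).
    footprint = disk(median_radius)
    background = np.stack(
        [ndi.median_filter(plane, footprint=footprint) for plane in stack]
    )
    subtracted = stack - background

    # Global threshold on the background-subtracted stack (Yen by default).
    binary = subtracted > method(subtracted)

    # Neighbour-count erosion (assumed rule): keep a foreground voxel only if
    # at least `erosion_n` of its 26 neighbours in 3D are also foreground.
    kernel = np.ones((3, 3, 3), dtype=int)
    kernel[1, 1, 1] = 0
    neighbours = ndi.convolve(binary.astype(int), kernel, mode="constant", cval=0)
    eroded = binary & (neighbours >= erosion_n)

    # Label connected components; the highest label equals the spot count.
    labels = label(eroded, connectivity=con)
    return int(labels.max())
```

With erosion_n = 1 this rule only discards isolated single voxels, which is one plausible reading of the setting used here.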

Cross-section area was obtained by median-filtering the stack with a 10 px disk brush, calculating the maximum projection image, thresholding using Otsu’s algorithm, and using skimage.measure.regionprops() to measure area. Note that these are pixel counts of cross-section area. To determine the total surface area, I assumed that an unbudded cell is spherical (surface area is four times the cross-section area).
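The spherical assumption follows from the cross-section of a sphere of radius r being \(\pi r^2\) and its surface being \(4 \pi r^2\). A minimal sketch of the conversion is below; the function name and the pixel_size_um argument are hypothetical, and no actual pixel size is implied.

```python
import math


def surface_area_from_cross_section(cross_section_px, pixel_size_um=1.0):
    """Estimate the surface area of a spherical cell from its cross-section.

    cross_section_px: cross-section area as a pixel count.
    pixel_size_um: edge length of one pixel in microns (placeholder value).
    """
    cross_section_um2 = cross_section_px * pixel_size_um ** 2
    # Sphere: surface = 4 * pi * r^2 = 4 * (pi * r^2) = 4 * cross-section.
    return 4.0 * cross_section_um2


# Sanity check against the closed-form sphere formulas for an arbitrary radius.
radius = 2.5
assert math.isclose(
    surface_area_from_cross_section(math.pi * radius ** 2),
    4.0 * math.pi * radius ** 2,
)
```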

This call to site_counter was used to process all datasets:

process_folder(path, median_radius = 5, erosion_n = 1, con = 2,
                   method = Yen, mask = False, loop = False, save_images = True)

data_cleanup.Rmd was used to gather all output into tidy data frames with no further modifications.

List of strains used

strain ede1
MKY0140 wt
MKY3770 pq
MKY3776 cc
MKY3782 pqcc
MKY0654 delta

Addendum on replicates

Despite my best intentions, there were some discrepancies in the illumination settings between datasets. Datasets #1 and #2 were acquired at 100% lamp power, but #1 had a 50% ND filter inserted in the light path and #2 did not, which I did not know at the time.

As a result, the overall signal in dataset #2 was brighter, but there was more bleaching of the further parts of the stack. This might have resulted in undercounting of some patches at the bottom.

With datasets #3 and #4 I was careful not to repeat the same mistake and acquired everything at 50% lamp power.

Therefore datasets 1, 3 and 4 could be seen as ‘canonical’ independent repeats. However, #2 does not ultimately seem like an outlier, so I included it in the final analysis.

Per-dataset summary

Patch number and area

ede1 dataset n patches_mean patches_sd patches_se area_mean area_sd area_se
wt 1 33 30.72727 7.702567 1.3408449 57.01005 7.899497 1.3751260
wt 2 52 25.28846 5.496297 0.7621992 53.95972 13.422453 1.8613593
wt 3 63 29.74603 8.090154 1.0192636 49.81814 6.591686 0.8304744
wt 4 56 29.30357 6.717466 0.8976592 53.99737 8.254882 1.1031050
pq 1 49 27.34694 7.512631 1.0732330 64.07668 9.498606 1.3569437
pq 2 61 23.54098 5.687922 0.7282638 64.85298 11.574001 1.4818990
pq 3 56 24.85714 7.590493 1.0143223 62.52588 11.025026 1.4732810
pq 4 56 25.08929 7.012396 0.9370709 59.07555 9.917223 1.3252447
cc 1 47 22.91489 5.356141 0.7812735 58.75194 10.444307 1.5234588
cc 2 52 20.01923 5.859433 0.8125572 56.04337 11.722141 1.6255685
cc 3 56 20.98214 6.397417 0.8548908 50.73425 10.279776 1.3736928
cc 4 56 21.98214 5.937526 0.7934354 52.72582 6.696932 0.8949152
pqcc 1 56 16.08929 6.162280 0.8234694 57.45513 8.006104 1.0698607
pqcc 2 62 16.80645 5.337276 0.6778347 59.10695 11.661839 1.4810550
pqcc 3 54 16.03704 6.933547 0.9435362 52.34180 10.339465 1.4070229
pqcc 4 49 19.30612 5.598378 0.7997683 54.37294 9.698409 1.3854870
delta 1 44 14.63636 6.324890 0.9535130 58.88947 10.932393 1.6481203
delta 2 45 15.46667 5.562047 0.8291410 56.58791 12.111897 1.8055350
delta 3 52 15.09615 6.114142 0.8478789 50.81073 8.869104 1.2299234
delta 4 52 17.96154 6.293356 0.8727315 57.26936 11.048515 1.5321534

Sla1 density

We can combine the patch number and area into \(density = \frac{patches}{area}\), calculated individually for each cell. We can summarise the data for each Ede1 mutant in each dataset:

ede1 dataset n density_mean density_sd density_se density_median density_mad
wt 1 33 0.5382331 0.1249763 0.0217556 0.5509918 0.1424483
wt 2 52 0.4853156 0.1166988 0.0161832 0.4963251 0.1036285
wt 3 63 0.5931019 0.1156752 0.0145737 0.5885655 0.0961262
wt 4 56 0.5488695 0.1227789 0.0164070 0.5656413 0.1345622
pq 1 49 0.4303778 0.1108477 0.0158354 0.4152123 0.0870553
pq 2 61 0.3707952 0.0950338 0.0121678 0.3782260 0.1032522
pq 3 56 0.4020377 0.1236939 0.0165293 0.3800441 0.1032181
pq 4 56 0.4306633 0.1226327 0.0163875 0.4399799 0.0801368
cc 1 47 0.3942192 0.0851579 0.0124216 0.4025688 0.0699532
cc 2 52 0.3679239 0.1097261 0.0152163 0.3763154 0.1089664
cc 3 56 0.4169928 0.1201964 0.0160619 0.4289882 0.1041989
cc 4 56 0.4174882 0.1025519 0.0137041 0.4003336 0.0996016
pqcc 1 56 0.2796635 0.0951736 0.0127181 0.2875115 0.0965174
pqcc 2 62 0.2906701 0.0906537 0.0115130 0.3039545 0.0874739
pqcc 3 54 0.3030404 0.1107896 0.0150766 0.2889415 0.1137708
pqcc 4 49 0.3595033 0.0960130 0.0137161 0.3714628 0.1000371
delta 1 44 0.2482628 0.0983372 0.0148249 0.2557438 0.1065543
delta 2 45 0.2789393 0.0996907 0.0148610 0.2756539 0.0931660
delta 3 52 0.2934675 0.1064340 0.0147597 0.3275947 0.0893750
delta 4 52 0.3199552 0.1152373 0.0159805 0.3101383 0.0986257
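As an illustration of how the columns in this table can be derived, here is a small sketch that computes per-cell density and the same summary statistics. The function name is hypothetical, and the MAD is scaled by 1.4826 on the assumption that it follows the default of R's mad().

```python
import math
import statistics


def summarise_density(patches, areas):
    """Per-cell density (patches / area) and summary statistics for one group."""
    densities = [p / a for p, a in zip(patches, areas)]
    n = len(densities)
    mean = statistics.mean(densities)
    sd = statistics.stdev(densities)  # sample SD (n - 1 denominator)
    median = statistics.median(densities)
    # Median absolute deviation, scaled for consistency with the normal SD
    # (assumption: same constant as the R default).
    mad = 1.4826 * statistics.median([abs(d - median) for d in densities])
    return {
        "n": n,
        "density_mean": mean,
        "density_sd": sd,
        "density_se": sd / math.sqrt(n),
        "density_median": median,
        "density_mad": mad,
    }
```

Calculating the ratio cell by cell before averaging, as done here, is not the same as dividing the mean patch number by the mean area.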

Plots

SuperPlots

I have chosen to show this data using the SuperPlot style. Each point shows density of Sla1-EGFP patches in an individual cell.

Big colour points show mean measurements from four independent repeats.

Range is mean +/- SD, calculated based on the four independent repeat means.

Beeswarm

Violin

With significance

Let’s add significance stars based on Tukey’s test.

This is only a subset of comparisons and it’s already cluttered. The alternative is…

Letter annotations

In this view, groups sharing at least one letter are not significantly different at the chosen \(\alpha\) (here, 0.05, i.e. 95% confidence).

Pros:

  • less cluttered and simpler to read (5 groups = 10 comparisons)
  • focus away from p-values; provide a binary decision on the null hypothesis

Cons:

  • cannot distinguish different confidence levels

Hypothesis testing

Assumptions

ANOVA and similar parametric tests assume that the errors are normally distributed, with homogeneous variances, and that the samples are independent.

We will test the null hypothesis that mean Sla1 density is the same across different Ede1 strains. We will use repeat-level data for the tests to account for experimental variability.

Normality

From the plots it looks like the underlying data is ‘normal enough’, considering that ANOVA can tolerate some departure from normality. We can check the normality of residuals used in the model later, but it might still be interesting to know how normal the underlying data is overall.

If we do a formal test (Shapiro-Wilk):

ede1 n shapiro.p
wt 204 0.7403509
pq 222 0.4077729
cc 211 0.1588376
pqcc 221 0.6747696
delta 193 0.0978718

Q-Q plots:

The data looks quite normal.

Homoscedasticity

4 points per group is probably enough to assess whether the variance is similar in the repeat-level data. Levene’s test:

df1 df2 statistic p
4 15 0.1598623 0.9553962

Levene’s test cannot reject the null of equal variances here, so there is no evidence that variance differs between groups.
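For reference, the statistic can be reproduced by hand from the repeat-level means in the per-dataset density table. The sketch below assumes the median-centred (Brown-Forsythe) variant of Levene's test, which is a one-way F statistic computed on absolute deviations from each group's median; the function names are mine.

```python
import statistics

# Repeat-level mean densities, copied from the per-dataset summary table.
repeat_means = {
    "wt": [0.5382331, 0.4853156, 0.5931019, 0.5488695],
    "pq": [0.4303778, 0.3707952, 0.4020377, 0.4306633],
    "cc": [0.3942192, 0.3679239, 0.4169928, 0.4174882],
    "pqcc": [0.2796635, 0.2906701, 0.3030404, 0.3595033],
    "delta": [0.2482628, 0.2789393, 0.2934675, 0.3199552],
}


def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def levene_median(groups):
    """Median-centred Levene statistic: one-way F on |x - group median|."""
    deviations = [[abs(x - statistics.median(g)) for x in g] for g in groups]
    return one_way_f(deviations)


w_stat, df1, df2 = levene_median(list(repeat_means.values()))
print(f"Levene W = {w_stat:.4f} on {df1} and {df2} df")
```

The p-value would then come from the upper tail of the F distribution with those degrees of freedom, which is not computed here.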

One-way ANOVA

Given the null of mean equality, what is the probability of obtaining results at least as extreme as these?

term df sumsq meansq statistic p.value
ede1 4 0.1640381 0.0410095 37.4362 1e-07
Residuals 15 0.0164318 0.0010955 NA NA

One-way ANOVA rejects the null with \(p = 10^{-7}\).
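The sums of squares and the F statistic in the table can be checked by hand from the repeat-level means listed in the per-dataset density table. This is a plain-Python sketch of the textbook one-way ANOVA decomposition, with hypothetical function and variable names; it stops at the F statistic and does not compute the p-value.

```python
import statistics

# Repeat-level mean densities, copied from the per-dataset summary table.
repeat_means = {
    "wt": [0.5382331, 0.4853156, 0.5931019, 0.5488695],
    "pq": [0.4303778, 0.3707952, 0.4020377, 0.4306633],
    "cc": [0.3942192, 0.3679239, 0.4169928, 0.4174882],
    "pqcc": [0.2796635, 0.2906701, 0.3030404, 0.3595033],
    "delta": [0.2482628, 0.2789393, 0.2934675, 0.3199552],
}


def one_way_anova(groups):
    """Return the one-way ANOVA decomposition for a dict of group -> values."""
    values = [x for g in groups.values() for x in g]
    grand_mean = statistics.mean(values)
    # Between-group sum of squares: group size times squared offset of its mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Within-group (residual) sum of squares.
    ss_within = sum(
        (x - statistics.mean(g)) ** 2 for g in groups.values() for x in g
    )
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "df": (df_between, df_within),
        "sumsq": (ss_between, ss_within),
        "statistic": f_stat,
    }


result = one_way_anova(repeat_means)
print(result)
```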

Diagnostic plots

The residuals look approximately normally distributed (histogram, Q-Q plot) with similar variance (Residuals vs. Fitted, grouped by factor).

Post-hoc test (Tukey)

Following the rejection of the null by ANOVA, we can use Tukey-Kramer to check pairwise comparisons.

term group1 group2 null.value estimate conf.low conf.high p.adj p.adj.signif
ede1 wt pq 0 -0.1329115 -0.2051799 -0.0606432 3.61e-04 ***
ede1 wt cc 0 -0.1422240 -0.2144923 -0.0699556 1.77e-04 ***
ede1 wt pqcc 0 -0.2331607 -0.3054290 -0.1608923 5.00e-07 ****
ede1 wt delta 0 -0.2562238 -0.3284922 -0.1839555 1.00e-07 ****
ede1 pq cc 0 -0.0093125 -0.0815808 0.0629559 9.94e-01 ns
ede1 pq pqcc 0 -0.1002492 -0.1725175 -0.0279808 5.02e-03 **
ede1 pq delta 0 -0.1233123 -0.1955807 -0.0510440 7.69e-04 ***
ede1 cc pqcc 0 -0.0909367 -0.1632050 -0.0186683 1.09e-02 *
ede1 cc delta 0 -0.1139998 -0.1862682 -0.0417315 1.62e-03 **
ede1 pqcc delta 0 -0.0230631 -0.0953315 0.0492052 8.58e-01 ns

Two comparisons do not produce statistically significant differences:

  • between ede1∆PQ and ede1∆CC
  • between ede1∆PQCC and ede1∆

For all other comparisons, \(p < 0.05\); for all comparisons with wild type, \(p < 0.001\).
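With equal group sizes, the confidence intervals in the Tukey table all have the same half-width, \(q \sqrt{MSE / n}\), so the test reduces to comparing each mean difference against that one number. The sketch below uses the residual mean square from the ANOVA table (0.0010955) and n = 4. The studentized range critical value q = 4.367 for 5 groups and 15 residual degrees of freedom at \(\alpha = 0.05\) is hard-coded from standard tables rather than computed, which is an assumption of this example.

```python
import itertools
import math
import statistics

# Repeat-level mean densities, copied from the per-dataset summary table.
repeat_means = {
    "wt": [0.5382331, 0.4853156, 0.5931019, 0.5488695],
    "pq": [0.4303778, 0.3707952, 0.4020377, 0.4306633],
    "cc": [0.3942192, 0.3679239, 0.4169928, 0.4174882],
    "pqcc": [0.2796635, 0.2906701, 0.3030404, 0.3595033],
    "delta": [0.2482628, 0.2789393, 0.2934675, 0.3199552],
}

MSE = 0.0010955    # residual mean square from the ANOVA table
N_PER_GROUP = 4    # independent repeats per strain
Q_CRIT = 4.367     # tabulated q(0.05; 5 groups, 15 df), hard-coded assumption

# Honestly significant difference: smallest |difference| declared significant.
hsd = Q_CRIT * math.sqrt(MSE / N_PER_GROUP)

group_means = {name: statistics.mean(vals) for name, vals in repeat_means.items()}
not_significant = [
    (a, b)
    for a, b in itertools.combinations(group_means, 2)
    if abs(group_means[b] - group_means[a]) < hsd
]

print(f"HSD = {hsd:.4f}")
print("not significant:", not_significant)
```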

Overall summary

Summary statistics for all experiments, derived from mean values of N independent repeats.

Final estimates

Final estimates with lower / upper 95% confidence intervals and a comparison to wild type (proc_wt, in %). half_ci is the half-width of the confidence interval, for writing CI ranges in the format mean +/- error.

ede1 mean lower upper proc_wt half_ci
wt 0.541 0.471 0.612 100 0.070
pq 0.408 0.363 0.454 75 0.045
cc 0.399 0.362 0.437 74 0.037
pqcc 0.308 0.252 0.365 57 0.056
delta 0.285 0.238 0.333 53 0.048
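These estimates are consistent with a t-based 95% confidence interval on the four repeat means of each strain, \(mean \pm t_{0.975, 3} \cdot sd / \sqrt{n}\). The sketch below reproduces the table under that assumption. The critical value 3.182446 (t distribution, 3 degrees of freedom) is hard-coded, and the function name is hypothetical.

```python
import math
import statistics

# Repeat-level mean densities, copied from the per-dataset summary table.
repeat_means = {
    "wt": [0.5382331, 0.4853156, 0.5931019, 0.5488695],
    "pq": [0.4303778, 0.3707952, 0.4020377, 0.4306633],
    "cc": [0.3942192, 0.3679239, 0.4169928, 0.4174882],
    "pqcc": [0.2796635, 0.2906701, 0.3030404, 0.3595033],
    "delta": [0.2482628, 0.2789393, 0.2934675, 0.3199552],
}

T_CRIT_DF3 = 3.182446  # two-sided 95% critical value of t with 3 df


def final_estimates(groups, reference="wt", t_crit=T_CRIT_DF3):
    """Mean, 95% CI and percent of the reference mean for each group."""
    ref_mean = statistics.mean(groups[reference])
    estimates = {}
    for name, values in groups.items():
        mean = statistics.mean(values)
        # Half-width of the CI: critical value times the standard error.
        half_ci = t_crit * statistics.stdev(values) / math.sqrt(len(values))
        estimates[name] = {
            "mean": mean,
            "lower": mean - half_ci,
            "upper": mean + half_ci,
            "proc_wt": 100.0 * mean / ref_mean,
            "half_ci": half_ci,
        }
    return estimates


for name, est in final_estimates(repeat_means).items():
    print(name, {key: round(value, 3) for key, value in est.items()})
```

Note that t_crit is only valid for groups with exactly four repeats; other group sizes would need a different critical value.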

Conclusions

  1. All mutations cause a significant reduction in patch density from wild type
  2. Ede1∆PQCC is statistically indistinguishable from the full Ede1 deletion; both reduce density by roughly 45% (to 57% and 53% of wild type, respectively)
  3. Individual PQ / CC deletions have intermediate defects

More statistics

ede1 N mean sd se median mad
wt 4 0.541 0.044 0.022 0.544 0.041
pq 4 0.408 0.028 0.014 0.416 0.021
cc 4 0.399 0.023 0.012 0.406 0.017
pqcc 4 0.308 0.035 0.018 0.297 0.017
delta 4 0.285 0.030 0.015 0.286 0.030

More statistics (observation-level)

So far we have mostly looked at the statistics derived from experiment-level means. For completeness, the table below reports the number of observations and density statistics derived from pooled observations from all repeats, in each group:

ede1 n mean sd se median mad 25% 75%
wt 204 0.545 0.125 0.009 0.551 0.132 0.459 0.633
pq 222 0.407 0.115 0.008 0.408 0.104 0.328 0.473
cc 211 0.400 0.107 0.007 0.399 0.099 0.332 0.465
pqcc 221 0.306 0.102 0.007 0.308 0.100 0.240 0.375
delta 193 0.287 0.108 0.008 0.289 0.107 0.218 0.361

Session info

## R version 4.1.0 (2021-05-18)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Mojave 10.14.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] multcompView_0.1-8 knitr_1.33         rstatix_0.7.0      broom_0.7.6       
##  [5] ggsignif_0.6.1     ggbeeswarm_0.6.0   forcats_0.5.1      stringr_1.4.0     
##  [9] dplyr_1.0.6        purrr_0.3.4        readr_1.4.0        tidyr_1.1.3       
## [13] tibble_3.1.2       ggplot2_3.3.3      tidyverse_1.3.1   
## 
## loaded via a namespace (and not attached):
##  [1] fs_1.5.0            lubridate_1.7.10    RColorBrewer_1.1-2 
##  [4] httr_1.4.2          tools_4.1.0         backports_1.2.1    
##  [7] utf8_1.2.1          R6_2.5.0            rpart_4.1-15       
## [10] vipor_0.4.5         Hmisc_4.5-0         DBI_1.1.1          
## [13] colorspace_2.0-1    nnet_7.3-16         withr_2.4.2        
## [16] gridExtra_2.3       tidyselect_1.1.1    curl_4.3.1         
## [19] compiler_4.1.0      cli_2.5.0           rvest_1.0.0        
## [22] htmlTable_2.2.1     xml2_1.3.2          labeling_0.4.2     
## [25] checkmate_2.0.0     scales_1.1.1        digest_0.6.27      
## [28] foreign_0.8-81      rmarkdown_2.8       rio_0.5.26         
## [31] base64enc_0.1-3     jpeg_0.1-8.1        pkgconfig_2.0.3    
## [34] htmltools_0.5.1.1   dbplyr_2.1.1        highr_0.9          
## [37] htmlwidgets_1.5.3   rlang_0.4.11        readxl_1.3.1       
## [40] rstudioapi_0.13     farver_2.1.0        generics_0.1.0     
## [43] jsonlite_1.7.2      zip_2.2.0           car_3.0-10         
## [46] magrittr_2.0.1      Formula_1.2-4       Matrix_1.3-3       
## [49] Rcpp_1.0.6          munsell_0.5.0       fansi_0.5.0        
## [52] abind_1.4-5         lifecycle_1.0.0     stringi_1.6.2      
## [55] yaml_2.2.1          carData_3.0-4       grid_4.1.0         
## [58] crayon_1.4.1        lattice_0.20-44     haven_2.4.1        
## [61] splines_4.1.0       hms_1.1.0           pillar_1.6.1       
## [64] reprex_2.0.0        glue_1.4.2          evaluate_0.14      
## [67] latticeExtra_0.6-29 data.table_1.14.0   modelr_0.1.8       
## [70] vctrs_0.3.8         png_0.1-7           cellranger_1.1.0   
## [73] gtable_0.3.0        assertthat_0.2.1    xfun_0.23          
## [76] openxlsx_4.2.3      mime_0.10           survival_3.2-11    
## [79] beeswarm_0.3.1      cluster_2.1.2       ellipsis_0.3.2
