This vignette demonstrates and explains the reasoning behind our preliminary data analysis. The aim of this document is to explore the role of ‘site’ in conidial dispersal.
From this data we hope to interpret the following.
In this experiment there are a number of factors which may influence conidia spread. These include:
Experimental location (site).
The time at which a spread event occurred (SpEv).
The factor SpEv would be nested within site.
The SpEv factor may capture variation in the data between spread events, such as differences in weather and climate variables.
Wind speed during the spread event.
Wind direction during the spread event.
Distance at which the trap plants were placed from the ascochyta-infested plots (distance).
The bearing along which the trap plants were placed relative to the infested plots (transect).
How the spread event was initiated: by sprinkler irrigation or by rainfall.
The quantity of rainfall.
Due to the lack of replicated pots at some of the distances, we will ignore transect as a factor. We know wind direction will influence our results, and we will need to accept that it adds variation which we may not be able to account for statistically.
I will start by using lmer() to analyse the mean number of lesions per plant at each distance. The reps at each distance are defined by ‘pot’; each pot contains three to five chickpea plants. The factor distance is fit as a continuous variable.
Site is a categorical variable describing the trial location. Each site may have experienced a different number of spread events, defined by the term SpEv. Rainfall is required for conidia to disperse from the infected focus, and each ‘spread event’ constitutes either an overhead irrigation event or a natural rainfall event.
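As a minimal sketch of how a pot-level response like m_lesions could be derived, the base R snippet below averages per-plant lesion counts within each pot. The data frame `plants`, its column names and its values are hypothetical toy data for illustration only; they are not taken from the experiment.

```r
# Hypothetical toy data: one row per plant (NOT the experimental data)
plants <- data.frame(
  pot      = c("A", "A", "A", "B", "B", "B", "B"),
  distance = c(10, 10, 10, 25, 25, 25, 25),
  lesions  = c(3, 1, 2, 0, 1, 0, 1)
)

# Mean number of lesions per plant within each pot (the replicate unit)
pot_means <- aggregate(lesions ~ pot + distance, data = plants, FUN = mean)
names(pot_means)[names(pot_means) == "lesions"] <- "m_lesions"

print(pot_means)
```

Averaging within pots gives one observation per replicate, so that plants sharing a pot are not treated as independent observations.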
The first models I will look at are asking:
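library(dplyr) # provides left_join()
library(lme4)  # provides lmer()
# Join the weather summaries to the lesion counts
dat <-
  left_join(lesion_counts, summary_weather, by = c("site", "rep"))
# Random intercept and distance slope for site and for SpEv nested within site
mod1 <-
  lmer(m_lesions ~ distance + (distance | site / SpEv),
       data = dat)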
cat("mod1: ")
## mod1:
formula(mod1)
## m_lesions ~ distance + (distance | site/SpEv)
summary(mod1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: m_lesions ~ distance + (distance | site/SpEv)
## Data: dat
##
## REML criterion at convergence: 911.9
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.3523 -0.6015 -0.0971 0.4908 5.8015
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## SpEv:site (Intercept) 0.5350421 0.73147
## distance 0.0001785 0.01336 -1.00
## site (Intercept) 0.2071781 0.45517
## distance 0.7845495 0.88575 0.09
## Residual 0.7695721 0.87725
## Number of obs: 334, groups: SpEv:site, 6; site, 3
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 1.88614 0.41445 4.551
## distance -0.02603 0.51142 -0.051
##
## Correlation of Fixed Effects:
## (Intr)
## distance 0.045
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
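Written out, the formula for mod1 corresponds to a model of roughly the following form. The notation is introduced here for illustration only: i indexes site, j indexes spread event within site, k indexes pot, y is m_lesions and d is distance.

```latex
\begin{aligned}
y_{ijk} &= (\beta_0 + u_{0i} + v_{0ij}) + (\beta_1 + u_{1i} + v_{1ij})\, d_{ijk} + \varepsilon_{ijk}, \\
(u_{0i}, u_{1i})^\top &\sim \mathcal{N}(\mathbf{0}, \Sigma_{\text{site}}), \\
(v_{0ij}, v_{1ij})^\top &\sim \mathcal{N}(\mathbf{0}, \Sigma_{\text{SpEv:site}}), \\
\varepsilon_{ijk} &\sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

Each covariance matrix is 2 x 2 and holds an intercept variance, a slope variance and their correlation, which are the quantities listed under "Random effects" in the summary.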
Let’s fit the model without ‘site’ to test whether it is a worse fit.
mod2 <-
  lmer(m_lesions ~ distance + (distance | SpEv),
       data = dat)
## boundary (singular) fit: see help('isSingular')
cat("mod2: ")
## mod2:
formula(mod2)
## m_lesions ~ distance + (distance | SpEv)
# Compare models
anova(mod1, mod2)
## refitting model(s) with ML (instead of REML)
## Data: dat
## Models:
## mod2: m_lesions ~ distance + (distance | SpEv)
## mod1: m_lesions ~ distance + (distance | site/SpEv)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## mod2 6 889.71 912.58 -438.85 877.71
## mod1 9 895.45 929.75 -438.72 877.45 0.2606 3 0.9673
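A comparison of the two models shows that mod2 is the preferable model: it has the lower AIC (889.71 versus 895.45), and the likelihood ratio test shows no significant difference between the models (p = 0.967), so the additional site terms do not improve the fit. Following a reductive approach, we should remove site from the model.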
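The figures in the anova() table can be checked by hand. The base R sketch below uses only the npar and deviance values printed in that table; the variable names are illustrative and are not part of the original analysis.

```r
# Values copied from the anova(mod1, mod2) table
npar     <- c(mod2 = 6, mod1 = 9)
deviance <- c(mod2 = 877.71, mod1 = 877.45)

# AIC = deviance + 2 * number of parameters
aic <- deviance + 2 * npar

# Likelihood ratio test: the drop in deviance is compared against a
# chi-squared distribution with df = difference in parameter count
chisq   <- unname(deviance["mod2"] - deviance["mod1"])
df      <- unname(npar["mod1"] - npar["mod2"])
p_value <- pchisq(chisq, df = df, lower.tail = FALSE)

print(aic)
print(c(chisq = chisq, df = df, p_value = p_value))
```

Because the printed deviances are rounded to two decimals, chisq comes out as 0.26 rather than 0.2606, and the p-value is approximately 0.967. Tests of variance components that sit on the boundary of the parameter space tend to be conservative, so this p-value is best read as a rough guide.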
summary(mod2)
## Linear mixed model fit by REML ['lmerMod']
## Formula: m_lesions ~ distance + (distance | SpEv)
## Data: dat
##
## REML criterion at convergence: 889.9
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.3652 -0.5628 -0.1261 0.4583 5.7863
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## SpEv (Intercept) 0.9534636 0.97645
## distance 0.0002697 0.01642 -1.00
## Residual 0.7698005 0.87738
## Number of obs: 334, groups: SpEv, 6
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 1.963345 0.406414 4.831
## distance -0.027052 0.006974 -3.879
##
## Correlation of Fixed Effects:
## (Intr)
## distance -0.986
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
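The "boundary (singular) fit" message is related to the correlation of -1.00 between the random intercept and the random slope. The base R sketch below uses the rounded variances and correlation printed in the mod2 summary to show that such a covariance matrix has a zero eigenvalue, meaning the two random effects collapse onto a single dimension. The object names are illustrative.

```r
# Rounded estimates copied from the SpEv random effects of mod2
var_intercept <- 0.9534636
var_slope     <- 0.0002697
corr          <- -1.00

# Covariance implied by the correlation and the two standard deviations
cov_is <- corr * sqrt(var_intercept) * sqrt(var_slope)

# 2 x 2 random-effects covariance matrix
Sigma <- matrix(c(var_intercept, cov_is,
                  cov_is,        var_slope),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("(Intercept)", "distance"),
                                c("(Intercept)", "distance")))

# A (near) zero eigenvalue indicates a degenerate, rank-deficient matrix
eigenvalues <- eigen(Sigma, symmetric = TRUE)$values
print(Sigma)
print(eigenvalues)
```

On the fitted model itself, lme4::isSingular(mod2) reports the same condition. A singular fit suggests the random-effects structure is more complex than the six spread events can support.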
We can also note that as distance increases there are fewer lesions per plant on average (the fixed-effect slope for distance in mod2 is -0.027), and the variance increases.
From here we should continue with a generalised additive model (GAM), which can handle non-linear terms better than a linear model.
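As a rough sketch of what such a GAM might look like, the snippet below fits a smooth of distance with a random intercept for spread event using mgcv. The choice of mgcv, the smooth specification, and the simulated data (including the distances and effect sizes) are all assumptions made for illustration; none of them come from the original analysis.

```r
library(mgcv)

set.seed(1)

# Simulated stand-in data (hypothetical distances, NOT the experimental data)
sim <- expand.grid(
  SpEv     = factor(paste0("ev", 1:6)),
  distance = c(0, 10, 25, 50, 75),
  pot      = 1:4
)
# Lesions decline non-linearly with distance, plus noise, truncated at zero
sim$m_lesions <- pmax(0, 2 * exp(-0.03 * sim$distance) +
                        rnorm(nrow(sim), sd = 0.5))

# Smooth term for distance; s(SpEv, bs = "re") is a random intercept
gam_fit <- gam(m_lesions ~ s(distance, k = 4) + s(SpEv, bs = "re"),
               data = sim, method = "REML")

summary(gam_fit)
```

The basis dimension k = 4 is kept small because this toy design has only five distinct distances; a real analysis would choose k from the data.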