library(tidyverse)

Import the data

ble <- read.table("../data/ble.txt", header=TRUE, sep=";", dec=".")
head(ble)

Perform a 1-factor ANOVA

ggplot(ble, aes(x=variete, y=rdt)) + geom_boxplot() +
  ggtitle("Boxplot") + xlab("Wheat variety") + ylab("Yield")

ggplot(ble, aes(x=phyto, y=rdt)) + geom_boxplot() +
  ggtitle("Boxplot") + xlab("Phytosanitary treatment") + ylab("Yield")

ANOVA test on wheat variety

anova_variete <- lm(rdt ~ variete, data=ble)
summary(anova_variete)
## 
## Call:
## lm(formula = rdt ~ variete, data = ble)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -344.20  -69.30   -6.60   89.15  329.90 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5633.80      26.30 214.211  < 2e-16 ***
## varieteV2     -49.70      37.19  -1.336  0.18546    
## varieteV3    -169.20      37.19  -4.549    2e-05 ***
## varieteV4     118.40      37.19   3.183  0.00211 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 117.6 on 76 degrees of freedom
## Multiple R-squared:  0.4476, Adjusted R-squared:  0.4258 
## F-statistic: 20.53 on 3 and 76 DF,  p-value: 7.674e-10

The p-value of the overall F-test is very small (7.674e-10), so we reject the null hypothesis that variety has no effect on yield: wheat variety does influence the yield. The p-values of the individual coefficients compare each variety with the reference level, variety 1. There is no significant difference in yield between varieties 1 and 2 (p = 0.185), while varieties 3 and 4 both differ significantly from variety 1 (p = 2e-05 and p = 0.002).
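The role of the reference level can be illustrated with a minimal sketch on simulated data. The data frame `toy`, its group means, and the seed are assumptions made for illustration; this is not the wheat data set. Under R's default treatment contrasts, the intercept is the mean of the first level and every other coefficient is a difference from that level; `relevel()` changes which level is used as the reference.

```r
# Simulated, balanced toy data (hypothetical values, not the wheat data)
set.seed(1)
toy <- data.frame(
  variete = factor(rep(c("V1", "V2", "V3", "V4"), each = 20)),
  rdt     = rnorm(80, mean = rep(c(5600, 5550, 5450, 5750), each = 20), sd = 120)
)

# Default treatment contrasts: V1 (the first level) is the reference
fit_v1 <- lm(rdt ~ variete, data = toy)
group_means <- sapply(split(toy$rdt, toy$variete), mean)

# Intercept = mean of V1; other coefficients = differences from V1
coef(fit_v1)
group_means - group_means[["V1"]]

# Using V3 as the reference changes the pairwise comparisons that are reported
toy$variete_ref3 <- relevel(toy$variete, ref = "V3")
fit_v3 <- lm(rdt ~ variete_ref3, data = toy)
summary(fit_v3)$coefficients

# The overall F-test does not depend on the choice of reference level
anova(fit_v1)
anova(fit_v3)
```

The coefficient table therefore only answers "does this level differ from the reference?"; comparisons between two non-reference levels require either releveling or a dedicated multiple-comparison procedure.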

anova(anova_variete)

ANOVA test on pesticide

anova_phyto <- lm(rdt ~ phyto, data=ble)
summary(anova_phyto)
## 
## Call:
## lm(formula = rdt ~ phyto, data = ble)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -337.12 -127.95   -4.17  106.03  341.88 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5612.23      24.69 227.291   <2e-16 ***
## phytoSans      -7.10      34.92  -0.203    0.839    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 156.2 on 78 degrees of freedom
## Multiple R-squared:  0.0005297,  Adjusted R-squared:  -0.01228 
## F-statistic: 0.04134 on 1 and 78 DF,  p-value: 0.8394

The p-value is large (0.839), so we do not reject the null hypothesis that pesticide use has no effect on yield: there is no significant difference between the yields obtained with and without pesticides.
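With a two-level factor, the one-factor ANOVA is equivalent to a two-sample t-test with equal variances, and the F statistic is the square of the t statistic (in the output above, (-0.203)^2 is about 0.041, matching the reported F). The following sketch checks this on simulated data. The data frame `toy`, its values, and the level name `"Avec"` are assumptions made for illustration; only the level `"Sans"` appears in the original output.

```r
# Simulated toy data with a two-level factor (hypothetical values)
set.seed(2)
toy <- data.frame(
  phyto = factor(rep(c("Avec", "Sans"), each = 40)),
  rdt   = rnorm(80, mean = 5600, sd = 150)
)

fit <- lm(rdt ~ phyto, data = toy)

# t statistic of the coefficient and F statistic of the ANOVA table
t_coef <- summary(fit)$coefficients["phytoSans", "t value"]
f_stat <- anova(fit)["phyto", "F value"]

# Classical two-sample t-test assuming equal variances
tt <- t.test(rdt ~ phyto, data = toy, var.equal = TRUE)

# t^2 equals F, and both approaches give the same p-value
c(t_squared = t_coef^2, F = f_stat)
c(p_lm = summary(fit)$coefficients["phytoSans", "Pr(>|t|)"], p_ttest = tt$p.value)
```

The sign of the t statistic may differ between `lm()` and `t.test()` because the two functions subtract the group means in opposite order; the absolute value and the p-value are the same.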

anova(anova_phyto)

These tests are also reliable because the design is balanced: there are 20 observations for each variety, half with and half without pesticides.
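Balance can be checked by cross-tabulating the two factors. The sketch below does this on a simulated design with the same layout as described (4 varieties, 2 treatments, 10 replicates per cell); the data frame `toy`, the level name `"Avec"`, and the simulated yields are assumptions for illustration. It also shows one practical consequence of balance: the sums of squares in the sequential ANOVA table do not depend on the order in which the factors enter the model.

```r
# Simulated balanced design: 4 varieties x 2 treatments x 10 replicates
toy <- expand.grid(
  rep     = 1:10,
  phyto   = c("Avec", "Sans"),
  variete = c("V1", "V2", "V3", "V4")
)

# Every cell of the cross-tabulation should contain the same count
cell_counts <- table(toy$variete, toy$phyto)
cell_counts
is_balanced <- length(unique(as.vector(cell_counts))) == 1
is_balanced

# Hypothetical yields, only to illustrate the order-independence
set.seed(3)
toy$rdt <- rnorm(nrow(toy), mean = 5600, sd = 120)

a1 <- anova(lm(rdt ~ variete + phyto, data = toy))
a2 <- anova(lm(rdt ~ phyto + variete, data = toy))

# Same sums of squares whichever factor is entered first
c(a1["variete", "Sum Sq"], a2["variete", "Sum Sq"])
c(a1["phyto", "Sum Sq"], a2["phyto", "Sum Sq"])
```

On the wheat data, the same check would be `table(ble$variete, ble$phyto)`.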

Perform a 2-factor ANOVA

anova_variete_phyto <- lm(rdt ~ variete * phyto, data=ble)
summary(anova_variete_phyto)
## 
## Call:
## lm(formula = rdt ~ variete * phyto, data = ble)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -329.80  -67.45   -8.20   76.28  339.50 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          5628.10      38.09 147.772  < 2e-16 ***
## varieteV2             -34.40      53.86  -0.639  0.52507    
## varieteV3            -167.60      53.86  -3.112  0.00267 ** 
## varieteV4             138.50      53.86   2.571  0.01219 *  
## phytoSans              11.40      53.86   0.212  0.83298    
## varieteV2:phytoSans   -30.60      76.17  -0.402  0.68908    
## varieteV3:phytoSans    -3.20      76.17  -0.042  0.96661    
## varieteV4:phytoSans   -40.20      76.17  -0.528  0.59930    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 120.4 on 72 degrees of freedom
## Multiple R-squared:  0.4512, Adjusted R-squared:  0.3979 
## F-statistic: 8.458 on 7 and 72 DF,  p-value: 1.622e-07

anova(anova_variete_phyto)

The p-values for phyto and for the variete:phyto interaction are large: neither pesticide use nor its interaction with variety has a significant effect on yield.
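Another way to test the interaction is to compare the additive model with the interaction model directly, using `anova()` on the two nested fits. The sketch below uses simulated data in which variety has an effect but pesticide and the interaction do not; the data frame `toy`, the effect sizes in `variete_effect`, and the level name `"Avec"` are assumptions for illustration, not estimates from the wheat data.

```r
# Simulated balanced design: 4 varieties x 2 treatments x 10 replicates
set.seed(4)
toy <- expand.grid(
  rep     = 1:10,
  phyto   = c("Avec", "Sans"),
  variete = c("V1", "V2", "V3", "V4")
)

# Hypothetical variety effects; no pesticide effect and no interaction
variete_effect <- c(V1 = 0, V2 = -50, V3 = -170, V4 = 120)
toy$rdt <- 5600 +
  unname(variete_effect[as.character(toy$variete)]) +
  rnorm(nrow(toy), sd = 120)

fit_add <- lm(rdt ~ variete + phyto, data = toy)  # main effects only
fit_int <- lm(rdt ~ variete * phyto, data = toy)  # main effects + interaction

# F-test of the 3 interaction terms (nested model comparison)
cmp <- anova(fit_add, fit_int)
cmp

# Same F statistic as the interaction row of the sequential ANOVA table
anova(fit_int)["variete:phyto", "F value"]
```

If the interaction is not significant, the simpler additive model `rdt ~ variete + phyto` is usually preferred for interpreting the main effects.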