library(tidyverse)
ble <- read.table("../data/ble.txt", header=TRUE, sep=";", dec=".")
head(ble)
ggplot(ble, aes(x=variete, y=rdt)) + geom_boxplot() +
  ggtitle("Boxplot") + xlab("Wheat variety") + ylab("Yield")
ggplot(ble, aes(x=phyto, y=rdt)) + geom_boxplot() +
  ggtitle("Boxplot") + xlab("Phytosanitary treatment") + ylab("Yield")
anova_variete <- lm(rdt ~ variete, data=ble)
summary(anova_variete)
##
## Call:
## lm(formula = rdt ~ variete, data = ble)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -344.20  -69.30   -6.60   89.15  329.90
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  5633.80      26.30 214.211  < 2e-16 ***
## varieteV2     -49.70      37.19  -1.336  0.18546
## varieteV3    -169.20      37.19  -4.549    2e-05 ***
## varieteV4     118.40      37.19   3.183  0.00211 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 117.6 on 76 degrees of freedom
## Multiple R-squared: 0.4476, Adjusted R-squared: 0.4258
## F-statistic: 20.53 on 3 and 76 DF, p-value: 7.674e-10
The p-value of the F-test is very small (7.7e-10), so wheat variety has a real influence on yield: we reject the null hypothesis that variety has no impact on yield. The p-value of each coefficient is relative to the first level, variety 1, which serves as the reference. There is no significant difference in yield between varieties 1 and 2, while varieties 3 and 4 differ significantly from variety 1.
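Because R uses treatment contrasts by default for an unordered factor, the intercept is the mean yield of the reference variety and every other coefficient is an offset from it. As a minimal sketch, the group means can be recovered from the estimates printed above; the names `coefs` and `group_means` are illustrative and not part of the original analysis.

```r
# Estimates copied from summary(anova_variete) above.
coefs <- c("(Intercept)" = 5633.80, varieteV2 = -49.70,
           varieteV3 = -169.20, varieteV4 = 118.40)

# With treatment contrasts, the intercept is the V1 mean and every
# other coefficient is a difference from that mean.
group_means <- c(V1 = coefs[["(Intercept)"]],
                 V2 = coefs[["(Intercept)"]] + coefs[["varieteV2"]],
                 V3 = coefs[["(Intercept)"]] + coefs[["varieteV3"]],
                 V4 = coefs[["(Intercept)"]] + coefs[["varieteV4"]])
print(group_means)
```

On the real data, `tapply(ble$rdt, ble$variete, mean)` should return the same four values, since a one-way model fits each group by its own mean.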
anova(anova_variete)
anova_phyto <- lm(rdt ~ phyto, data=ble)
summary(anova_phyto)
##
## Call:
## lm(formula = rdt ~ phyto, data = ble)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -337.12 -127.95   -4.17  106.03  341.88
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  5612.23      24.69 227.291   <2e-16 ***
## phytoSans      -7.10      34.92  -0.203    0.839
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 156.2 on 78 degrees of freedom
## Multiple R-squared: 0.0005297, Adjusted R-squared: -0.01228
## F-statistic: 0.04134 on 1 and 78 DF, p-value: 0.8394
The p-value is large (0.84), so we do not reject the hypothesis that pesticide use has no influence on yield. There is no significant difference between the yields obtained with and without pesticides.
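Since `phyto` has only two levels, the overall F-test and the t-test on `phytoSans` are the same test, with F equal to the squared t statistic. The sketch below checks this using the rounded values printed in the summary; the variable names are illustrative.

```r
# Values copied from summary(anova_phyto) above (rounded as printed).
t_phyto <- -0.203
f_stat  <- 0.04134

# For a two-level factor, F equals t squared; the small gap here
# comes from the rounding of the printed t value.
print(t_phyto^2)

# The p-value of the F-test on 1 and 78 degrees of freedom.
p_from_f <- pf(f_stat, df1 = 1, df2 = 78, lower.tail = FALSE)
print(p_from_f)
```

This is why the p-value on the `phytoSans` row and the p-value on the last line of the summary agree.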
anova(anova_phyto)
These tests are also reliable because the design is balanced: there are 20 observations for each variety, half with and half without pesticides.
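A balanced design means that every combination of variety and treatment holds the same number of observations. The sketch below rebuilds such a layout without the data file. The treatment label `Avec` and the 10 replicates per cell are assumptions inferred from the description above, not values read from `ble.txt`.

```r
# Hypothetical layout: 4 varieties x 2 treatments x 10 replicates.
# "Avec" is an assumed label for the treated level; only "Sans"
# appears in the model output.
design <- expand.grid(rep = 1:10,
                      phyto = c("Avec", "Sans"),
                      variete = paste0("V", 1:4))

# Cross-tabulate the two factors: a balanced design has equal counts.
cell_counts <- table(design$variete, design$phyto)
print(cell_counts)

# Residual degrees of freedom of a model with 8 cell parameters.
print(nrow(design) - 8)
```

On the real data, `table(ble$variete, ble$phyto)` performs the same check.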
anova_variete_phyto <- lm(rdt ~ variete * phyto, data=ble)
summary(anova_variete_phyto)
##
## Call:
## lm(formula = rdt ~ variete * phyto, data = ble)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -329.80  -67.45   -8.20   76.28  339.50
##
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)
## (Intercept)          5628.10      38.09 147.772  < 2e-16 ***
## varieteV2             -34.40      53.86  -0.639  0.52507
## varieteV3            -167.60      53.86  -3.112  0.00267 **
## varieteV4             138.50      53.86   2.571  0.01219 *
## phytoSans              11.40      53.86   0.212  0.83298
## varieteV2:phytoSans   -30.60      76.17  -0.402  0.68908
## varieteV3:phytoSans    -3.20      76.17  -0.042  0.96661
## varieteV4:phytoSans   -40.20      76.17  -0.528  0.59930
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 120.4 on 72 degrees of freedom
## Multiple R-squared: 0.4512, Adjusted R-squared: 0.3979
## F-statistic: 8.458 on 7 and 72 DF, p-value: 1.622e-07
anova(anova_variete_phyto)
The p-values for phyto and variete:phyto are large: neither pesticide use nor the interaction between pesticide use and variety has a significant effect on yield.
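To read the interaction terms, note that with treatment contrasts the effect of removing pesticides within a given variety is the `phytoSans` coefficient plus the matching interaction coefficient. A minimal sketch using the estimates from the summary above (the variable names are illustrative):

```r
# Estimates copied from summary(anova_variete_phyto) above.
phyto_sans <- 11.40
interaction_terms <- c(V1 = 0, V2 = -30.60, V3 = -3.20, V4 = -40.20)

# Difference "Sans" minus the reference treatment within each variety:
# main effect of phyto plus the variety-specific interaction term.
phyto_effect_by_variety <- phyto_sans + interaction_terms
print(phyto_effect_by_variety)
```

Each of these differences is far smaller than the residual standard error of 120.4, which is consistent with the large p-values.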