https://gabridego.github.io/MoSIG-SMPE-2022/
This project is maintained by gabridego
There are different notions of interval (confidence, credibility, fluctuation, …). They have different meanings but are sometimes computed in the same way. Intuitively, the more experiments we run, the closer we get to the true value. There is a tradeoff between precision and cost, and we can still be unlucky with the results we obtain.
The more controlled an experiment is, the lower the variance, but control can introduce bias.
This approach does not assume any particular data distribution, only that the mean and variance are defined. The sample variance can be used in place of the true variance. If an upper bound on the true variance is known, it should be used instead.
For most distributions, the distribution of the sample mean converges (in terms of area) to a Gaussian after about 30 repetitions. With fewer than 30, use Student's t instead, but note that it assumes normally distributed data.
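As a rough illustration of the Gaussian approximation, here is a minimal sketch of a confidence interval for the mean that uses the sample variance. The function name `mean_confidence_interval`, the 95% level, and the synthetic exponential data are assumptions made for this example, not part of the course material. With fewer than 30 samples, the Gaussian quantile `z` would be replaced by a Student's t quantile, which is not shown here.

```python
import math
import random
import statistics
from statistics import NormalDist


def mean_confidence_interval(samples, confidence=0.95):
    """Gaussian-approximation confidence interval for the mean (hypothetical helper)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate the variance")
    mean = statistics.fmean(samples)
    # Sample standard deviation (n - 1 denominator), used in place of the true one.
    std = statistics.stdev(samples)
    # Two-sided Gaussian quantile, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * std / math.sqrt(n)
    return mean - half_width, mean + half_width


# Synthetic, non-Gaussian data (assumption: exponential with rate 1, fixed seed).
rng = random.Random(42)
samples = [rng.expovariate(1.0) for _ in range(100)]
low, high = mean_confidence_interval(samples)
print(f"95% CI for the mean: [{low:.3f}, {high:.3f}]")
```

The half-width shrinks as 1/sqrt(n), so halving the interval requires roughly four times as many repetitions, which is the precision versus cost tradeoff.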
Measurements should be as independent as possible: use randomization (e.g. flipping a coin).
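One possible way to apply randomization is to shuffle the order in which configurations are run, so that drift over time (warm-up, background load) does not systematically favor one of them. This is only a sketch: the helper `randomized_schedule` and the configuration labels `"A"` and `"B"` are hypothetical names chosen for the example.

```python
import random


def randomized_schedule(configurations, repetitions, seed=None):
    """Return every configuration `repetitions` times, in random order."""
    rng = random.Random(seed)  # a seed keeps the schedule reproducible
    runs = [config for config in configurations for _ in range(repetitions)]
    rng.shuffle(runs)  # the software equivalent of flipping a coin
    return runs


# Hypothetical experiment: two configurations, five runs each.
schedule = randomized_schedule(["A", "B"], repetitions=5, seed=0)
print(schedule)
```

Recording the seed alongside the results makes it possible to reconstruct the exact run order later.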
How to handle outliers? The definition of an outlier depends on the context. A data point can be safely removed if we know something went wrong during the experiment and the point does not represent the real behavior. Otherwise, it's cherry-picking. Only drop data if there is a good reason to do so.
Sample mean is very sensitive to outliers.
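A small sketch of this sensitivity, using made-up measurement values (they are illustrative only, not data from the course). The median is shown purely as a point of comparison and is an addition of this example.

```python
import statistics

# Made-up measurements (illustrative values only).
measurements = [10.1, 9.8, 10.3, 9.9, 10.0]
with_outlier = measurements + [100.0]  # one run where something went wrong

mean_clean = statistics.fmean(measurements)
mean_out = statistics.fmean(with_outlier)
median_clean = statistics.median(measurements)
median_out = statistics.median(with_outlier)

print(f"mean:   {mean_clean:.2f} -> {mean_out:.2f}")
print(f"median: {median_clean:.2f} -> {median_out:.2f}")
```

A single bad point moves the mean by a large amount while the median barely changes, which is why the decision to keep or drop such a point has to be justified rather than made for convenience.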