A one-sample t-test checks whether the mean of the population from which a sample was drawn differs significantly from a hypothesised value.
The null hypothesis is that the population mean equals the hypothesised value; the alternative hypothesis is that it does not. This is known as a two-tailed t-test.
If we have a prior belief that the population mean is larger or smaller than the hypothesised value, we can use a one-tailed t-test instead. The null hypothesis is the same, but the alternative hypothesis is that the population mean is higher (or lower, depending on the case) than the hypothesised value.
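As a minimal sketch of the difference between the two kinds of test, the `alternative` argument of `t.test()` selects the alternative hypothesis. The vector `x` below is a made-up sample used only for illustration; it is not the data analysed in this article.

```r
# Hypothetical sample, invented purely for illustration
x <- c(52.1, 48.3, 55.7, 50.9, 53.4, 49.8, 54.2, 51.6)

# Two-tailed test (the default): the alternative is "mean != 50"
two_sided <- t.test(x, mu = 50, alternative = "two.sided")

# One-tailed tests: the alternative is "mean > 50" or "mean < 50"
greater <- t.test(x, mu = 50, alternative = "greater")
less    <- t.test(x, mu = 50, alternative = "less")

# The t statistic is the same in all three; only the p-value changes
p_values <- c(two_sided = two_sided$p.value,
              greater   = greater$p.value,
              less      = less$p.value)
print(p_values)
```

The direction of a one-tailed test should be chosen before looking at the data, otherwise the reported p-value is too optimistic.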
Example of a one-sample t-test in R
Let’s say we have a sample of values as follows:
set.seed(150)
data <- data.frame(Value = rnorm(30, mean = 50, sd = 10))
We want to see if the mean of those numbers differs from 50 in any way. The null hypothesis is that the population mean is equal to the hypothesised mean (50), and the alternative hypothesis is that the mean differs from the hypothesised mean (either above or below).
R code for the one-sample t-test
To perform the one-sample t-test in R, use the following code:
test <- t.test(data$Value, mu = 50)
Now let’s analyse the output of the test:
> test

	One Sample t-test

data:  data$Value
t = 0.57321, df = 29, p-value = 0.5709
alternative hypothesis: true mean is not equal to 50
95 percent confidence interval:
 47.02585 55.29045
sample estimates:
mean of x 
 51.15815 
The p-value is 0.5709, which is above the 5% significance level, so the null hypothesis cannot be rejected.
The t-value expresses the size of the difference between the sample mean and the hypothesised value relative to the variation in the sample data. The larger the absolute t-value, the stronger the evidence against the null hypothesis. The t-value in our example is 0.57, which is low.
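To make the t-value concrete, here is a minimal sketch that recomputes it by hand as the difference between the sample mean and the hypothesised mean, divided by the standard error of the mean. The names `mu0`, `x_bar`, `se` and `t_manual` are illustrative choices, not part of the `t.test()` output.

```r
# Recreate the sample used in this example
set.seed(150)
data <- data.frame(Value = rnorm(30, mean = 50, sd = 10))

mu0   <- 50                           # hypothesised mean
n     <- nrow(data)                   # sample size
x_bar <- mean(data$Value)             # sample mean
se    <- sd(data$Value) / sqrt(n)     # standard error of the mean

# t statistic: distance from mu0 measured in standard errors
t_manual <- (x_bar - mu0) / se

# Compare with the value reported by t.test()
test <- t.test(data$Value, mu = mu0)
print(c(manual = t_manual, t.test = unname(test$statistic)))
```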
Degrees of freedom
In a t-test, one degree of freedom is “spent” estimating the mean, so the degrees of freedom will be n-1 – the number of values in the sample minus 1 – which in this case is 29.
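The degrees of freedom determine which t distribution the p-value is read from. The sketch below, with the illustrative names `df_manual` and `p_manual`, recovers the two-tailed p-value from the t statistic using `pt()`, the cumulative distribution function of the t distribution.

```r
# Recreate the sample and run the test
set.seed(150)
data <- data.frame(Value = rnorm(30, mean = 50, sd = 10))
test <- t.test(data$Value, mu = 50)

# Degrees of freedom: sample size minus one
df_manual <- nrow(data) - 1

# Two-tailed p-value: probability in both tails beyond |t|
t_stat   <- unname(test$statistic)
p_manual <- 2 * pt(-abs(t_stat), df = df_manual)

print(c(df = df_manual, p_manual = p_manual, p_t.test = test$p.value))
```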
95% confidence interval
The 95% confidence interval for our test is 47.03 to 55.29. This means that at the 5% significance level, the null hypothesis cannot be rejected for hypothesised means between 47.03 and 55.29. Because 50 lies inside this interval, the result agrees with the p-value reported above.
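The link between the confidence interval and the test can be checked directly. This is a minimal sketch, with illustrative names such as `ci_manual`, that rebuilds the interval from the sample mean, the standard error and the critical t-value from `qt()`, then tests one hypothesised mean inside the interval and one outside it.

```r
# Recreate the sample and run the test
set.seed(150)
data <- data.frame(Value = rnorm(30, mean = 50, sd = 10))
test <- t.test(data$Value, mu = 50)

n      <- nrow(data)
x_bar  <- mean(data$Value)
se     <- sd(data$Value) / sqrt(n)
t_crit <- qt(0.975, df = n - 1)       # critical value for a 95% interval

# Interval: sample mean plus or minus t_crit standard errors
ci_manual <- x_bar + c(-1, 1) * t_crit * se
print(rbind(manual = ci_manual, t.test = as.numeric(test$conf.int)))

# A hypothesised mean inside the interval is not rejected at the 5% level,
# while one outside the interval is rejected
p_inside  <- t.test(data$Value, mu = mean(ci_manual))$p.value
p_outside <- t.test(data$Value, mu = ci_manual[2] + 1)$p.value
print(c(inside = p_inside, outside = p_outside))
```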