Analysis of Latin Square Design using R

A Latin square design controls two gradients with crossed blocks, with exactly one treatment level at each row-column intersection. Its major feature is the capacity to handle two known sources of variation among experimental units simultaneously. It therefore provides two independent blocking criteria, unlike the randomized complete block design, which handles only one blocking criterion. Horizontal blocking is referred to as row blocking and vertical blocking as column blocking. This two-directional blocking is accomplished by ensuring that each treatment appears exactly once in each row block and once in each column block. The variation among row blocks and among column blocks can then be estimated and removed from the experimental error, which can reduce that error considerably.

Conditions for Latin Square design

A Latin square design can be used under the following conditions:

Field trials in which the experimental area has fertility gradients in two directions perpendicular to each other, or has a unidirectional fertility gradient together with residual effects from previous trials.
Insecticide field trials where the insect migration has a predictable direction that is perpendicular to the dominant fertility gradient of the experimental field.
Greenhouse trials in which the experimental pots are arranged in a straight line perpendicular to the glass or screen walls, such that the difference among rows of pots and the distance from the glass wall are expected to be the two major sources of variability among the experimental pots.
Laboratory trials with replication over time, such that the difference among experimental units conducted at the same time and among those conducted over time constitute the two known sources of variability.
Animal nutrition studies, such as nutrition trials on dairy cattle, where only a few cows may be available for financial reasons.

Restrictions

The Latin square design has the following restrictions:

The requirement that every treatment appears exactly once in each row block and each column block is also the design's major restriction. Because the number of replications must equal the number of treatments, the experiment becomes impractical when the number of treatments is very large.
On the other hand, if the number of treatments is small, the degrees of freedom associated with experimental error become too small for the error to be reliably estimated.
In agricultural experiments where the land requirement is rigid, laying out the square in the field is laborious and access to the central plots becomes difficult.
Because of these restrictions, the Latin square design is used in practice only where the number of treatments is not less than four and not greater than eight. Despite its great potential for controlling experimental error, the design is not widely used in agricultural experiments.
If there are missing observations in the experiment, the analysis becomes complicated.

Randomization of Latin Square Design

Before carrying out an experiment, the design should be randomized with the restriction that each treatment occurs once within each row and once within each column. First randomize the order of the rows, then the order of the columns, and finally assign the treatments.

Randomize the order of rows

For this purpose, draw a square with the letters in alphabetical order. Draw one random number per row and rank the random numbers. According to the ranked order of rows:

4th row is placed 1st
3rd row is placed 2nd
1st row is placed 3rd and
2nd row is placed 4th

Random numbers	Ranked order of rows
0.910	4
0.843	3
0.324	1
0.679	2

Before randomizing rows

A	B	C	D
B	C	D	A
C	D	A	B
D	A	B	C

After randomizing rows

D	A	B	C
C	D	A	B
A	B	C	D
B	C	D	A

Randomize the order of columns

Now randomize the columns. According to the random numbers, the ranked order for columns is 2, 3, 1 and 4. So according to the ranked order of columns:

2nd column is placed as 1st
3rd column is placed as 2nd
1st column is placed as 3rd and
4th column is placed as 4th

Random numbers	Ranked order of columns
0.628	2
0.871	3
0.158	1
0.947	4

Before randomizing columns

D	A	B	C
C	D	A	B
A	B	C	D
B	C	D	A

After randomizing columns

A	B	D	C
D	A	C	B
B	C	A	D
C	D	B	A

Randomize the order of treatments

The final step is to assign the treatments randomly. Assign treatments to the letters according to the random numbers. According to the random numbers, the ranked order of treatments is 1, 4, 3 and 2.

Random numbers	Ranked order of treatments
0.039	1
0.718	4
0.569	3
0.182	2

Assigning treatments labels randomly

A	B	C	D
T1	T4	T3	T2

Treatments allocation in Latin Square

T1	T4	T2	T3
T2	T1	T3	T4
T4	T3	T1	T2
T3	T2	T4	T1
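
The three randomization steps above can be reproduced in R. The following is a minimal sketch that reuses the random numbers from the tables above; the object names (square, row_rank, col_rank, trt_rank, labels, layout) are chosen here for illustration only. In a real experiment you would draw fresh random numbers, for example with runif(4).

```r
# Standard 4 x 4 square with the letters in cyclic alphabetical order
square <- matrix(c("A", "B", "C", "D",
                   "B", "C", "D", "A",
                   "C", "D", "A", "B",
                   "D", "A", "B", "C"),
                 nrow = 4, byrow = TRUE)

# Random numbers copied from the tables above; rank() gives the ranked order
row_rank <- rank(c(0.910, 0.843, 0.324, 0.679))  # rows
col_rank <- rank(c(0.628, 0.871, 0.158, 0.947))  # columns
trt_rank <- rank(c(0.039, 0.718, 0.569, 0.182))  # treatments

# Step 1: reorder the rows, Step 2: reorder the columns
square <- square[row_rank, ]
square <- square[, col_rank]

# Step 3: map the letters A-D to treatment labels T1-T4
labels <- setNames(paste0("T", trt_rank), c("A", "B", "C", "D"))
layout <- matrix(unname(labels[as.vector(square)]), nrow = 4)
print(layout)

# Each treatment must occur exactly once in every row and every column
is_latin <- all(apply(layout, 1, function(x) length(unique(x)) == 4)) &&
            all(apply(layout, 2, function(x) length(unique(x)) == 4))
print(is_latin)
```

The printed layout should match the treatment allocation shown above, and the final check confirms that the randomized square still satisfies the Latin square restriction.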

Importing data

Suppose we have data obtained from an experimental area with a two-directional gradient in soil fertility. The data show the yield of four varieties of wheat arranged in a four by four Latin square design. To analyse the data, first import them into R. Before importing the data set, I recommend clearing all objects from the global environment with the rm() function, shutting down all open graphics devices with the graphics.off() function, and clearing the console by passing a system command to the shell() function. Note that shell() is available only on Windows.

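rm(list = ls(all = TRUE))   # remove all objects from the global environment
graphics.off()              # close all open graphics devices
shell("cls")                # clear the console (Windows only)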

The next step is to import the data set. Load the readxl package with the library() function, then use the read_excel() function to import the data from an Excel spreadsheet. In the path argument, provide the location of the file. Set the col_names argument to TRUE if the first row of the file contains the variable names. The data are now available in R, stored in an object that I have chosen to call data.

library(readxl)

data = read_excel(path = "E:/Youtube Channel/Agron Info Tech/Lectures/Data Analysis/23 LSD analysis in R/data-LS Design.xlsx",
                  col_names = TRUE)

Viewing data

You can inspect the data in a spreadsheet-style viewer by typing View(data). You can also use the head() or tail() function to display only the beginning or the end of the data set. These functions cannot be used to edit the data. The fix() function opens a small spreadsheet in R, which can be used to both view and edit the data.

head(data)

# # A tibble: 6 x 4
#     Row Column Varieties Yield
#   <dbl>  <dbl> <chr>     <dbl>
# 1     1      1 B          1.64
# 2     1      2 D          1.21
# 3     1      3 C          1.42
# 4     1      4 A          1.34
# 5     2      1 C          1.48
# 6     2      2 A          1.18

Verify the variables structure

To verify the structure of the data, use the str() function. It shows whether each variable is being read as character, numeric, integer or factor.

str(data)

# tibble [16 x 4] (S3: tbl_df/tbl/data.frame)
#  $ Row      : num [1:16] 1 1 1 1 2 2 2 2 3 3 ...
#  $ Column   : num [1:16] 1 2 3 4 1 2 3 4 1 2 ...
#  $ Varieties: chr [1:16] "B" "D" "C" "A" ...
#  $ Yield    : num [1:16] 1.64 1.21 1.42 1.34 1.48 ...

In the structure of the variables we can see that Row and Column are being read as numeric and Varieties (the treatment variable) as character, instead of as factors. We can convert them to factors with the as.factor() function. The attach() function gives direct access to the variables of a data frame, so that each variable can be referred to by its name as written in the first line of the file.

data$Row <- as.factor(data$Row)
data$Column <- as.factor(data$Column)
data$Varieties <- as.factor(data$Varieties)
str(data)

# tibble [16 x 4] (S3: tbl_df/tbl/data.frame)
#  $ Row      : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 2 2 2 2 3 3 ...
#  $ Column   : Factor w/ 4 levels "1","2","3","4": 1 2 3 4 1 2 3 4 1 2 ...
#  $ Varieties: Factor w/ 4 levels "A","B","C","D": 2 4 3 1 3 1 4 2 1 3 ...
#  $ Yield    : num [1:16] 1.64 1.21 1.42 1.34 1.48 ...

attach(data)
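
Before fitting the model it can be worth confirming that the layout really is a Latin square, that is, that every variety occurs exactly once in each row and each column. The sketch below shows one way to do this with table(). It uses a small hypothetical layout (the object toy, which is not the wheat data) so that it runs on its own; with the imported data you would apply the same table() calls to data$Row, data$Column and data$Varieties.

```r
# Hypothetical 4 x 4 cyclic layout, used only to illustrate the check
toy <- expand.grid(Row = factor(1:4), Column = factor(1:4))
idx <- (as.integer(toy$Row) + as.integer(toy$Column) - 2) %% 4 + 1
toy$Varieties <- factor(LETTERS[idx])

# Cross-tabulate varieties against rows and against columns
row_check <- table(toy$Row, toy$Varieties)
col_check <- table(toy$Column, toy$Varieties)
print(row_check)

# In a valid Latin square every cell of both tables equals 1
print(all(row_check == 1) && all(col_check == 1))
```

A count other than 1 in either table would point to a data entry error or a missing observation.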

Apply Latin Square model

To apply the Latin square design model, define an object model and assign to it the result of the linear model function lm(). In the formula argument, the response variable Yield is written on the left of the tilde (~) and Row, Column and Varieties on the right. The output can be obtained using the anova() or summary() function. The analysis of variance table shows that the varieties differ significantly in grain yield. There is also a highly significant difference in yield due to column blocking, whereas the effect of row blocking is not significant.

model <- lm(formula = Yield ~ Row + Column + Varieties, data = data)
anova(model)

# Analysis of Variance Table
# 
# Response: Yield
#           Df  Sum Sq  Mean Sq F value   Pr(>F)   
# Row        3 0.03015 0.010052  0.4654 0.716972   
# Column     3 0.82734 0.275781 12.7692 0.005148 **
# Varieties  3 0.42684 0.142281  6.5879 0.025092 * 
# Residuals  6 0.12958 0.021597                    
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
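
To see where the numbers in this table come from, the sketch below recomputes the degrees of freedom, F values and p-values by hand from the mean squares printed above. The object names (n_trt, ms, mse, f_values, p_values) are chosen here for illustration. Because the mean squares are copied from the rounded output, the results agree with the table only up to rounding.

```r
n_trt <- 4                               # treatments = rows = columns

# Degrees of freedom in a Latin square
df_effect <- n_trt - 1                   # rows, columns and treatments: 3 each
df_error  <- (n_trt - 1) * (n_trt - 2)   # residual: 6

# Mean squares copied from the ANOVA table above
ms  <- c(Row = 0.010052, Column = 0.275781, Varieties = 0.142281)
mse <- 0.021597

# F value = mean square of the effect / residual mean square
f_values <- ms / mse

# Upper-tail probability of the F distribution gives Pr(>F)
p_values <- pf(f_values, df1 = df_effect, df2 = df_error, lower.tail = FALSE)

print(round(f_values, 4))
print(round(p_values, 6))
```

The residual degrees of freedom, (t - 1)(t - 2) for t treatments, shrink quickly as the square gets smaller, which is why very small Latin squares estimate the error poorly.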

Mean separation test

The analysis of variance table does not identify the specific pairs or groups of varieties that differ. For example, the F test cannot answer the question of whether every one of three varieties gave a significantly higher yield than a check variety, or whether there are significant differences among the three varieties themselves. To answer such questions you need a mean comparison test.

Now let's go deeper and compare the performance of these varieties by applying a suitable mean separation test. Let's apply the Least Significant Difference (LSD) test to see which varieties performed best in grain yield. To apply the LSD test, first load the agricolae package with the library() function. For multiple comparison of treatments use the LSD.test() function. In the y argument, supply either the fitted model or the response variable. In the trt argument, supply the name of the treatment variable.

Note: if you pass a model (aov or lm) to the y argument, the variable name for the trt argument must be written in quotation marks; otherwise quotation marks are not required.

library(agricolae)
# LSD test
LSD.test(y = model,
         trt = "Varieties",
         DFerror = model$df.residual,
         MSerror = deviance(model)/model$df.residual,
         alpha = 0.05,
         group = TRUE,
         console = TRUE)

# 
# Study: model ~ "Varieties"
# 
# LSD t Test for Yield 
# 
# Mean Square Error:  0.0215974 
# 
# Varieties,  means and individual ( 95 %) CI
# 
#     Yield       std r       LCL     UCL   Min   Max
# A 1.46375 0.2386900 4 1.2839503 1.64355 1.185 1.670
# B 1.47125 0.2095382 4 1.2914503 1.65105 1.290 1.665
# C 1.06750 0.4426153 4 0.8877003 1.24730 0.660 1.475
# D 1.33875 0.1795538 4 1.1589503 1.51855 1.180 1.565
# 
# Alpha: 0.05 ; DF Error: 6
# Critical Value of t: 2.446912 
# 
# least Significant Difference: 0.2542752 
# 
# Treatments with the same letter are not significantly different.
# 
#     Yield groups
# B 1.47125      a
# A 1.46375      a
# D 1.33875      a
# C 1.06750      b

The results show that varieties B, A and D were statistically at par with one another, and that each of them yielded significantly more than variety C.
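
The critical value and the least significant difference reported above can be checked by hand. The sketch below applies the usual formula LSD = t × sqrt(2 × MSE / r), using the mean square error, error degrees of freedom, replication and variety means copied from the output above; the object names are chosen here for illustration.

```r
mse      <- 0.0215974   # Mean Square Error from the output above
df_error <- 6           # error degrees of freedom
r        <- 4           # replications per variety
alpha    <- 0.05

# Two-sided critical value of Student's t
t_crit <- qt(1 - alpha / 2, df_error)

# Least significant difference between two variety means
lsd <- t_crit * sqrt(2 * mse / r)
print(c(t_crit = t_crit, lsd = lsd))

# Variety means from the output above
means <- c(A = 1.46375, B = 1.47125, C = 1.06750, D = 1.33875)

# Absolute pairwise differences; TRUE marks pairs that differ significantly
diffs <- abs(outer(means, means, "-"))
significant <- diffs > lsd
print(significant)
```

Only the pairs involving variety C exceed the least significant difference, which is why C receives its own letter in the grouping.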

Courtesy: AGRON stats