Multivariate Anova part 2

Previously, introduction to the Multivariate Anova, and Multivariate Visual Error Oval.

We continue our exploration of a simple multivariate anova by looking at different effect sizes of the factors and different correlations between the variables. We have seen that two apparently insignificant univariate anovas can be shown by a multivariate anova to have masked differences which are in fact quite significant. This is because the multivariate analysis takes the correlation between the variables into account, something that multiple univariate analyses cannot do. In that particular case, the positive correlation between the variables sets an expectation that any differences between the variable means should also show a positive relationship, and when the data show that the differences have a negative relationship, this is detected as significant.

Our next enquiry concerns an effect size that does yield a significant univariate result, in a case where the differences between the Group (Control and Treatment) means show the same relationship as would be expected from the correlation between the variables.

Consistent centroid trend (r = .71), partially significant univariate, significant multivariate

We take the data of Table 1 from the page introducing the Multivariate Anova and change the effect size of our elixir so that the Treatment group's mean Confidence exceeds the Control group's by 3 instead of 1, as per Table 1 below.

Table 1. Data and descriptive statistics

Control			Treatment
Case	Confidence	Test	Case	Confidence	Test
1	8	42	10	11	41
2	10	47	11	10	47
3	13	67	12	14	56
4	10	63	13	15	62
5	10	52	14	13	53
6	12	46	15	13	43
7	7	35	16	14	65
8	8	46	17	11	59
9	12	52	18	16	69
Mean	10.0	50.0		13.0	55.0
StDev	2.06	9.97		2.00	9.81

The Treatment group has a higher mean Confidence and higher mean Test score than the Control. The effect sizes are shown in Table 2.

Table 2. Effect sizes

	Confidence	Test
Difference	3.0	5.0
Pooled s	2.50	9.94
Effect size (d)	1.20	0.51
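The figures in Table 2 can be checked from the Table 1 data. The sketch below is an illustration rather than the author's own calculation. It assumes that "Pooled s" is the standard deviation of all 18 cases taken together, which is what reproduces the 2.50 and 9.94 in the table; the more conventional within-group pooled standard deviation would be smaller. The function name effect_size is hypothetical.

```python
from statistics import mean, stdev

# Table 1 data
control_conf = [8, 10, 13, 10, 10, 12, 7, 8, 12]
control_test = [42, 47, 67, 63, 52, 46, 35, 46, 52]
treat_conf = [11, 10, 14, 15, 13, 13, 14, 11, 16]
treat_test = [41, 47, 56, 62, 53, 43, 65, 59, 69]

def effect_size(control, treatment):
    """Return (mean difference, SD of all cases combined, their ratio)."""
    diff = mean(treatment) - mean(control)
    # Assumption: the table's "Pooled s" is the SD of both groups combined
    s_all = stdev(control + treatment)
    return diff, s_all, diff / s_all

conf = effect_size(control_conf, treat_conf)
test = effect_size(control_test, treat_test)
print("Confidence: diff=%.1f  s=%.2f  d=%.2f" % conf)
print("Test:       diff=%.1f  s=%.2f  d=%.2f" % test)
```

Computed this way, the Test ratio comes out at about 0.50, a shade under the 0.51 printed in the table, so a slightly different divisor may have been used for that entry.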

While the Test score difference is half a standard deviation in size, the Confidence difference is 1.2 standard deviations in size. The scattergram for this data is shown in Figure 1.

Figure 1. Scattergram of the data of Table 1

Visually, Figure 1 paints a picture of considerable separation of the Treatment group markers from the Control group markers, mainly along the Confidence axis. Notice that the centroid trend line is similarly positive and consistent with the trend lines of the group data, and that the centroids are further apart than seen in Figure 3 of the earlier Multivariate Anova discussion. Note that the visual error ovals (see Multivariate Visual Error Oval) do not overlap. The multivariate test is shown in Table 3.

Table 3. Multivariate test of Table 1 data

Multivariate Test
Pillai's Trace
Effect	Value	F	Hyp df	Err df	p
Treat_vs_Cont	0.44	5.82	2	15	0.01

This is significant. There is a statistically significant Treatment effect when considering Confidence and Test score together, which allows us to examine the elixir Treatment effect on Confidence and on Test score considered separately. This is shown in Table 4.

Table 4. Univariate tests of Table 1 data

Univariate Tests
Source		SS	df	MS	F	p
Treat_vs_Cont	Conf	40.50	1	40.50	9.82	0.01
Treat_vs_Cont	Score	112.50	1	112.5	1.15	0.30
Error	Conf	66.00	16	4.13
Error	Score	1566.0	16	97.88

There are no surprises here and the interpretation of the results is straightforward — the statistically significant Treatment effect when considering Confidence and Test score together is shown to be mainly due to a statistically significant Treatment effect on Confidence (effect size = 1.20), p = .01; the Treatment effect on Test score is not significant (effect size = 0.51), p = .30.
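For readers who want to check Tables 3 and 4 by hand, the following sketch recomputes them from the Table 1 data using only the Python standard library. It is a minimal illustration for the two-group, two-measure case, not the output of any particular statistics package. The variable names are my own, and the 2 x 2 matrix inverse is written out explicitly rather than taken from a library.

```python
# Table 1 data as (Confidence, Test) pairs
control = [(8, 42), (10, 47), (13, 67), (10, 63), (10, 52),
           (12, 46), (7, 35), (8, 46), (12, 52)]
treatment = [(11, 41), (10, 47), (14, 56), (15, 62), (13, 53),
             (13, 43), (14, 65), (11, 59), (16, 69)]

def centroid(group):
    n = len(group)
    return (sum(c for c, _ in group) / n, sum(t for _, t in group) / n)

def within_sscp(group):
    """Sums of squares and the cross-product about the group centroid."""
    mc, mt = centroid(group)
    scc = sum((c - mc) ** 2 for c, _ in group)
    stt = sum((t - mt) ** 2 for _, t in group)
    sct = sum((c - mc) * (t - mt) for c, t in group)
    return scc, stt, sct

n1, n2 = len(control), len(treatment)
df_error = n1 + n2 - 2

# Error (within-group) sums of squares and cross-product, pooled over groups
e_cc, e_tt, e_ct = (a + b for a, b in zip(within_sscp(control),
                                          within_sscp(treatment)))

# Differences between the group centroids
d_c = centroid(treatment)[0] - centroid(control)[0]
d_t = centroid(treatment)[1] - centroid(control)[1]
k = n1 * n2 / (n1 + n2)

# Univariate F ratios (compare Table 4)
f_conf = k * d_c ** 2 / (e_cc / df_error)
f_test = k * d_t ** 2 / (e_tt / df_error)

# lam = k * d' E^-1 d, with the inverse of the 2 x 2 error matrix written out
det = e_cc * e_tt - e_ct ** 2
lam = k * (d_c ** 2 * e_tt - 2 * d_c * d_t * e_ct + d_t ** 2 * e_cc) / det

pillai = lam / (1 + lam)             # one hypothesis df, so a single root
f_multi = lam * (df_error - 1) / 2   # F on 2 and df_error - 1 = 15 df

print(f"Pillai = {pillai:.2f}, F = {f_multi:.2f}")
print(f"F(Conf) = {f_conf:.2f}, F(Score) = {f_test:.2f}")
```

The cross-product term e_ct is the only place where the correlation between Confidence and Test score enters, which is why the univariate F ratios cannot see it.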

Consistent centroid trend (r = .71), significant univariate, insignificant multivariate

Our next enquiry concerns the situation where there are significant univariate results yet the multivariate analysis stubbornly refuses to declare a significant result. As before, we reengineer our data to have "large" effect sizes while keeping the same pattern of correlation for both the data points and the mean data centroids. We skip the raw data table and show the effect size table only, as in Table 5, and then show the scattergram in Figure 2.

Table 5. Effect sizes

	Confidence	Test
Difference	2.0	10.0
Pooled s	2.22	10.89
Effect size (d)	0.90	0.92

Figure 2. Scattergram of the data summarised in Table 5 as derived from the data of Table 1

Notice the centroid trend line, which is giving us a visual representation of the two-variable "distance" between the Treatment and Control groups. In particular, notice that the length of this trend line is very similar to the length of the Figure 1 centroid trend line, where the "distance" between Treatment and Control groups was shown to be significant by the multivariate analysis of Table 3.

The Confidence and Test score differences are a little under one standard deviation in size, said by Cohen to be "large". Two large effect sizes should give us two significant univariate results, but do they yield a significant multivariate result? We can anticipate the answer by inspecting the visual error ovals, which seem to have a slight overlap. Table 6 shows the multivariate test, and Table 7 shows the univariate tests.

Table 6. Multivariate test of data illustrated in Figure 2.

Multivariate Test
Pillai's Trace
Effect	Value	F	Hyp df	Err df	p
Treat_vs_Cont	0.25	2.46	2	15	0.12

Table 7. Univariate tests following Table 6 test

Univariate Tests
Source		SS	df	MS	F	p
Treat_vs_Cont	Conf	18.00	1	18.00	4.36	0.05
Treat_vs_Cont	Score	450.00	1	450.0	4.60	0.05
Error	Conf	66.00	16	4.13
Error	Score	1566.0	16	97.88

Although we have the expected two significant univariate tests, p = 0.05 each, the multivariate test fails to reach significance, p = 0.12. Because the two variables are strongly positively correlated, a higher mean Confidence is expected to be accompanied by a higher mean Test score, so the second difference adds little evidence beyond the first. Taken together, the group centroids are not significantly different even though the individual means are. Another way of saying this is that the two groups of data, Treatment and Control, are judged to have come from the same population (are not significantly different) when the measures, Confidence and Test score, are considered together.
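The role of the correlation can be made explicit. For two groups and two measures, the multivariate statistic can be written in terms of the two univariate F ratios and the within-group correlation r. The sketch below uses that identity. The function name pillai_from_univariate is hypothetical, r is taken as the rounded .71 from the section heading, and sign is +1 when the two mean differences have the same sign. Because the inputs are rounded, the results agree with Tables 3 and 6 only approximately.

```python
from math import sqrt

def pillai_from_univariate(f1, f2, r, df_error, sign=1):
    """Two groups, two measures: Pillai's Trace and its F ratio.

    f1, f2   : univariate F ratios for the two measures
    r        : pooled within-group correlation between the measures
    sign     : +1 if both mean differences have the same sign, else -1
    """
    cross = 2 * r * sign * sqrt(f1 * f2)   # part of the evidence that r "explains"
    lam = (f1 + f2 - cross) / (df_error * (1 - r ** 2))
    pillai = lam / (1 + lam)
    f_multi = lam * (df_error - 1) / 2     # F on 2 and df_error - 1 df
    return pillai, f_multi

# Univariate F ratios from Table 4 and from Table 7
table4 = pillai_from_univariate(9.82, 1.15, r=0.71, df_error=16)
table7 = pillai_from_univariate(4.36, 4.60, r=0.71, df_error=16)

print("From Table 4: Pillai = %.2f, F = %.2f" % table4)
print("From Table 7: Pillai = %.2f, F = %.2f" % table7)
```

The two cases have broadly similar totals of univariate F (about 11 and about 9), but the cross term removes much more when both differences are large in the direction the correlation predicts, leaving the Table 7 case with the smaller multivariate statistic.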

Consistent centroid trend (r = .93), significant univariate, insignificant multivariate

This enquiry examines the situation with a very high correlation between the measures, r = 0.93. The data is reengineered to have particularly large effect sizes around 1, giving significant univariate results but an insignificant multivariate test. The effect size table is shown in Table 8, and the scattergram in Figure 3.

Table 8. Effect sizes

	Confidence	Test
Difference	2.41	12.41
Pooled s	2.29	11.62
Effect size (d)	1.05	1.07

Figure 3. Scattergram of data with r = 0.93 and effect sizes summarised in Table 8

We can anticipate the multivariate test result from inspection of the visual error ovals, which seem to have a slight overlap. Table 9 shows the multivariate test, and Table 10 shows the univariate tests.

Table 9. Multivariate test of data illustrated in Figure 3.

Multivariate Test
Pillai's Trace
Effect	Value	F	Hyp df	Err df	p
Exp_vs_Cont	0.31	3.35	2	15	0.06

Table 10. Univariate tests of data illustrated in Figure 3.

Univariate Tests
Source		SS	df	MS	F	p
Exp_vs_Cont	Conf	26.16	1	26.16	6.62	0.02
Exp_vs_Cont	Score	693.16	1	693.2	6.92	0.02
Error	Conf	63.21	16	3.95
Error	Score	1602.3	16	100.1

Despite the two univariate tests being significant, p = .02 each, the multivariate test declares that the Treatment and Control groups cannot be shown to have come from different populations when the measures are considered together, p = .06. As before, we can attribute this result to the very high correlation between the measures, with the centroid trend line running in exactly the direction that the correlation predicts.
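The p values reported for the multivariate tests so far can be checked from their F ratios. When the numerator has exactly 2 degrees of freedom, as in Tables 3, 6 and 9, the upper tail of the F distribution has a simple closed form, so no statistical library is needed. The sketch below applies it. The function name is hypothetical, and the shortcut does not apply to the univariate tests, which have 1 numerator degree of freedom.

```python
def p_from_f_2df(f, df2):
    """Upper-tail p for an F ratio with 2 numerator df and df2 denominator df.

    Valid only when the numerator df is exactly 2.
    """
    return (1 + 2 * f / df2) ** (-df2 / 2)

# Multivariate F ratios as reported in the tables, each on 2 and 15 df
reported = {"Table 3": 5.82, "Table 6": 2.46, "Table 9": 3.35}
p_values = {name: p_from_f_2df(f, 15) for name, f in reported.items()}

for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
```

To two decimal places these agree with the p values printed in the three tables.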

When r = 0

Our next enquiry concerns the situation when there is negligible correlation between the measures. We reengineer our data to have "large" effect sizes. Figure 4 shows the scattergram.

Figure 4. Scattergram of data with visual error ovals for r=0

For this data, the multivariate and univariate tests are shown in Tables 11 and 12.

Table 11. Multivariate test of Figure 4 data

Multivariate Test
Pillai's Trace
Effect	Value	F	Hyp df	Err df	p
Exp_vs_Cont	0.36	4.24	2	15	0.03

The multivariate test shows that the two group centroids are significantly different, p = .03, and indeed the visual error ovals show no overlap. For interest, the univariate tests are shown in Table 12: the differences in mean Confidence and mean Test score are also significant, p = .05 each, a degree of significance similar to that of the multivariate test of the centroids.

Table 12. Univariate tests of Figure 4 data

Univariate Tests
Source		SS	df	MS	F	p
Exp_vs_Cont	Conf	18.40	1	18.40	4.57	0.05
Exp_vs_Cont	Score	449.00	1	449.0	4.49	0.05
Error	Conf	64.36	16	4.02
Error	Score	1598.7	16	99.92

When r is negative, inconsistent centroid trend

We check our original finding (Figure 2 in Multivariate Anova) that the multivariate analysis is a sensitive test of a centroid trend that is contrary to the correlation seen in the group data, even when the correlation is negative. The example data has been reengineered to yield the scattergram of Figure 5, approximately a mirror image of the earlier scattergram.

Figure 5. Scattergram with group r ≈ –.72, effect sizes 0.50

The results of the multivariate and univariate tests are given in Tables 13 and 14.

Table 13. Multivariate test of Figure 5 data

Multivariate Test
Pillai's Trace
Effect	Value	F	Hyp df	Err df	p
Exp_vs_Cont	0.35	3.97	2	15	0.04

As before, there is a statistically significant Treatment effect considering Confidence and Test score together, p = .04.

Table 14. Univariate tests of Figure 5 data

Univariate Tests
Source		SS	df	MS	F	p
Exp_vs_Cont	Conf	4.50	1	4.50	1.29	0.27
Exp_vs_Cont	Score	107.56	1	107.6	1.08	0.31
Error	Conf	56.00	16	3.50
Error	Score	1598.9	16	99.93

The elixir has no significant effects on Confidence or Test score when these measures are considered separately, p = .27 and .31 respectively. The multivariate analysis tests and flags a result that cannot be tested or flagged by univariate tests — whether the Treatment effect is consistent with, or the opposite of, what would be expected given the correlation between the measures.
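To see how much the direction of the correlation matters, the sketch below holds the two univariate F ratios of Table 14 fixed and varies only the within-group correlation. It uses the two-group, two-measure relationship between the univariate F ratios, r, and Pillai's Trace. It assumes that both mean differences have the same sign (a positive centroid trend) and that the r ≈ –.72 of the Figure 5 caption stands in for the pooled within-group correlation. The function name is hypothetical.

```python
from math import sqrt

F_CONF, F_TEST = 1.29, 1.08   # univariate F ratios from Table 14
DF_ERROR = 16

def pillai_for_r(r):
    """Pillai's Trace and F when both means shift the same way (assumption)."""
    lam = (F_CONF + F_TEST - 2 * r * sqrt(F_CONF * F_TEST)) / (DF_ERROR * (1 - r ** 2))
    return lam / (1 + lam), lam * (DF_ERROR - 1) / 2

# Contrary trend (r negative), no correlation, consistent trend (r positive)
results = {r: pillai_for_r(r) for r in (-0.72, 0.0, 0.72)}

for r, (v, f) in results.items():
    print(f"r = {r:+.2f}: Pillai = {v:.2f}, F = {f:.2f}")
```

With r = –.72 the result is close to Table 13. With the same univariate F ratios but r = +.72, the centroid trend would be consistent with the correlation and the multivariate statistic would be much smaller.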

Inconsistent group correlation

Finally, we explore the most unlikely of data, where one group shows a positive correlation between the measures, and the other shows the opposite. Our data is engineered to have r = 0.71 in the Control group data as may be seen in Figure 2 and most of the other scattergrams, and r = –0.74 in the Treatment group as seen above in Figure 5. The effect sizes are 0.5. The scattergram is shown in Figure 6.

Figure 6. Scattergram with Control group r = .71, Treatment group r = –.74

The overlap of the visual error ovals suggests the multivariate test will be insignificant. It is shown in Table 15 with the univariate tests in Table 16.

Table 15. Multivariate test of the data in Figure 6.

Multivariate Test
Pillai's Trace
Effect	Value	F	Hyp df	Err df	p
Exp_vs_Cont	0.12	1.05	2	15	0.37

As anticipated, the multivariate test is not significant; the centroids are not significantly different, p = 0.37.

Table 16. Univariate tests of the data in Figure 6.

Univariate Tests
Source		SS	df	MS	F	p
Exp_vs_Cont	Conf	4.50	1	4.50	1.13	0.30
Exp_vs_Cont	Score	112.50	1	112.5	1.13	0.30
Error	Conf	64.00	16	4.00
Error	Score	1590.0	16	99.38

We have seen this data with these effect sizes before, and they routinely yield insignificant univariate tests with each p in the region of 0.30 (slightly different values elsewhere due to rounding errors). We can see that the multivariate test p and the univariate test p's are similar in magnitude.

To round off our exploration of multivariate testing so far, it is interesting to run an analysis where r = 0 for both groups and the effect sizes are 0.50 as seen before. Unsurprisingly, the multivariate test returns p = 0.37, and the univariate tests return p = 0.30 and p = 0.31. Wait. We have just seen more or less identical results when the r for Control was .71 and the r for Treatment was –.74. What is interesting is that the multivariate analysis treats conflicting correlations in the two groups as cancelling each other out to give an effective "overall" r of 0. Indeed, when the raw data for the two groups are combined, overall r = 0.07, and when simply averaged (§1), r = –0.02.

(§1) In general, the correlation seen between the measures in any one group will be different from that seen in a different group. To provide a best estimate of the population correlation between the measures, the multivariate analysis averages the correlation seen in each group to give an "overall" estimate. This is one of the assumptions of a multivariate analysis of variance — the assumption of homogeneity of measure correlations across the groups.
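The different "overall" correlations mentioned above can be illustrated with the Table 1 data, since the raw data behind Figure 6 are not listed on this page. The sketch below compares the correlation in each group, their simple average, the correlation obtained by pooling the within-group sums of squares and cross-products (my assumption about how the pooling is done), and the correlation of all 18 cases combined. The function names are hypothetical.

```python
from math import sqrt

# Table 1 data as (Confidence, Test) pairs
control = [(8, 42), (10, 47), (13, 67), (10, 63), (10, 52),
           (12, 46), (7, 35), (8, 46), (12, 52)]
treatment = [(11, 41), (10, 47), (14, 56), (15, 62), (13, 53),
             (13, 43), (14, 65), (11, 59), (16, 69)]

def sscp(points):
    """Sums of squares and cross-product about the mean of the given points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return sxx, syy, sxy

def r_from(sxx, syy, sxy):
    return sxy / sqrt(sxx * syy)

r_control = r_from(*sscp(control))
r_treatment = r_from(*sscp(treatment))
r_average = (r_control + r_treatment) / 2

# Pooled within-group correlation: add the group SSCP terms, then correlate
pooled = [a + b for a, b in zip(sscp(control), sscp(treatment))]
r_pooled = r_from(*pooled)

# Correlation of all cases combined, ignoring group membership
r_combined = r_from(*sscp(control + treatment))

print(f"Control r = {r_control:.3f}, Treatment r = {r_treatment:.3f}")
print(f"Average r = {r_average:.3f}, pooled r = {r_pooled:.3f}")
print(f"Combined r = {r_combined:.3f}")
```

Here the two groups have nearly the same correlation, so the average and the pooled value almost coincide. When the group correlations have opposite signs, the same pooling makes the cross-products largely cancel.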

Next, Multivariate Anova Part 3, Multivariate Anova part 4.