Analysis of a dependent interval variable Y on independent nominal variables X

Multifactor analysis of variance (ANOVA)

The purpose of analysis of variance is to test differences in means (for groups or variables) for statistical significance. This is accomplished by analyzing the variance, that is, by partitioning the total variance into the component that is due to random error (within-group variation) and the components that are due to differences between group means. The F-ratio, which compares a between-group mean square with the within-group mean square, is used to determine statistical significance. The tests are nondirectional: the null hypothesis states that all means for a given main effect are equal (or, for an interaction, that no interaction effect is present), and the alternative hypothesis states only that at least one mean differs. A statistically significant factor therefore sorts the dependent variable Y into groups such that the mean of at least one group differs from the means of the other groups. ANOVA also allows us to detect interaction effects between variables (factors) and, therefore, to test more complex hypotheses about reality.
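
The partitioning described above can be sketched for the simplest case of a single factor. The data and the level names below are hypothetical and chosen only for illustration; the sketch computes the between-group and within-group sums of squares and the resulting F-ratio, but it does not compute a p-value, which would require the F distribution.

```python
from statistics import mean

# Hypothetical data: values of Y for three levels of one factor.
groups = {
    "low": [4.0, 5.0, 6.0],
    "mid": [6.0, 7.0, 8.0],
    "high": [9.0, 10.0, 11.0],
}

all_values = [y for ys in groups.values() for y in ys]
grand_mean = mean(all_values)

# Between-group component: spread of the group means around the grand mean.
ss_between = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())

# Within-group component: spread of observations around their own group mean.
ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)

# Total variation of Y around the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_values)

# Degrees of freedom for each component.
df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# F-ratio: between-group mean square divided by within-group mean square.
f_ratio = (ss_between / df_between) / (ss_within / df_within)

print(f"SS total = {ss_total:.2f}, between = {ss_between:.2f}, within = {ss_within:.2f}")
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

A large F-ratio indicates that the variation between group means is large relative to the random variation within groups; the value is then compared with the F distribution for the given degrees of freedom.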

A linear mathematical model for a two-factor ANOVA could be as follows:

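Y_ijk = µ + a_i + b_j + (ab)_ij + e_ijk

where
Y_ijk - the value of Y for the kth individual at the ith level of factor A and the jth level of factor B.
µ - the overall (grand) mean.
a_i - the contribution (main effect) of the ith level of factor A.
b_j - the contribution (main effect) of the jth level of factor B.
(ab)_ij - the combined contribution (interaction effect) of the ith level of factor A and the jth level of factor B.
e_ijk - the contribution of the kth individual (the random error).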
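
As a minimal sketch of how the terms of this model can be estimated, the example below assumes a balanced design (the same number of individuals in every cell) and uses hypothetical data. The variable names (`mu`, `a`, `b`, `ab`, `e`) mirror the symbols of the model and are not taken from any particular library. The sketch also forms the F-ratios for the two main effects and the interaction; p-values are omitted.

```python
from statistics import mean

# Hypothetical balanced data: data[i][j] holds the observations of Y for
# level i of factor A and level j of factor B (2 x 2 design, 3 replicates).
data = [
    [[10.0, 12.0, 11.0], [14.0, 15.0, 16.0]],
    [[13.0, 14.0, 12.0], [22.0, 23.0, 21.0]],
]
n_a, n_b, n_rep = len(data), len(data[0]), len(data[0][0])

# Estimates of the model terms (valid for a balanced design).
cell_mean = [[mean(cell) for cell in row] for row in data]
mu = mean([y for row in data for cell in row for y in cell])
a = [mean(cell_mean[i]) - mu for i in range(n_a)]
b = [mean([cell_mean[i][j] for i in range(n_a)]) - mu for j in range(n_b)]
ab = [[cell_mean[i][j] - mu - a[i] - b[j] for j in range(n_b)] for i in range(n_a)]
e = [[[y - cell_mean[i][j] for y in data[i][j]] for j in range(n_b)]
     for i in range(n_a)]

# Sums of squares for each term of the model.
ss_a = n_b * n_rep * sum(x ** 2 for x in a)
ss_b = n_a * n_rep * sum(x ** 2 for x in b)
ss_ab = n_rep * sum(x ** 2 for row in ab for x in row)
ss_error = sum(r ** 2 for row in e for cell in row for r in cell)

# Degrees of freedom and the error mean square.
df_a, df_b = n_a - 1, n_b - 1
df_ab = df_a * df_b
df_error = n_a * n_b * (n_rep - 1)
ms_error = ss_error / df_error

# F-ratios: each effect's mean square divided by the error mean square.
f_a = (ss_a / df_a) / ms_error
f_b = (ss_b / df_b) / ms_error
f_ab = (ss_ab / df_ab) / ms_error

print(f"mu = {mu}, a = {a}, b = {b}, ab = {ab}")
print(f"F(A) = {f_a:.2f}, F(B) = {f_b:.2f}, F(AxB) = {f_ab:.2f}")
```

By construction, every observation equals the sum of its estimated terms, which is the model equation applied to the sample. For unbalanced designs these simple averages no longer give the usual estimates, and a general linear model routine would be needed instead.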

Assumptions

Normality
The errors e_ijk follow a normal probability distribution with a mean of zero.

Homogeneity of variances
The errors e_ijk have the same variance for all values of i, j, and k.
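
One simple way to get a first impression of whether the variances are homogeneous is to compare the largest and smallest cell variances. The sketch below uses hypothetical data and the ratio known as Hartley's F_max statistic, which is a technique chosen here for illustration and is not prescribed by the text above; it assumes equal cell sizes and is itself sensitive to departures from normality.

```python
from statistics import mean, variance

# Hypothetical observations of Y for the four cells of a 2 x 2 design.
cells = {
    ("A1", "B1"): [10.0, 12.0, 11.0],
    ("A1", "B2"): [14.0, 15.0, 16.0],
    ("A2", "B1"): [13.0, 15.0, 11.0],
    ("A2", "B2"): [22.0, 23.0, 21.0],
}

# Residuals: deviations of each observation from its own cell mean.
residuals = {key: [y - mean(ys) for y in ys] for key, ys in cells.items()}

# Sample variance within each cell.
cell_variance = {key: variance(ys) for key, ys in cells.items()}

# Hartley's F_max: ratio of the largest to the smallest cell variance.
f_max = max(cell_variance.values()) / min(cell_variance.values())

for key, var in cell_variance.items():
    print(key, f"variance = {var:.2f}")
print(f"F_max = {f_max:.2f}")
```

A ratio close to 1 is consistent with equal variances; larger values are compared with tabulated critical values of F_max. The residuals computed here average to zero within each cell by construction, so the normality assumption has to be judged from their shape, for example with a histogram or a normal probability plot.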

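Independence of individuals
Each observation must come from a different object, so that no individual appears in more than one group and the value observed for one individual does not influence the value observed for another.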

If the dependent variable is multi-dimensional (e.g. three different indicators of success), multivariate analysis of variance (MANOVA) must be used instead.

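Example: Is there a relationship between the number of products sold by a commercial representative (Y) and the representative's age group (X1), gender (X2), and education level (X3)? Because the factors must be nominal, age has to be recorded as categories (age groups) rather than as a number of years.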