Data reduction and structure detection

Principal Components Analysis, Factor Analysis

The main applications of factor analytic techniques are (1) to reduce the number of variables and (2) to detect structure in the relationships between variables, that is, to classify variables. Factor analysis is an exploratory technique applied to a set of observed variables. It seeks underlying factors (latent dimensions, each associated with a subset of the variables) from which the observed variables were generated. For example, in a research survey on television programme assessment, 750 viewers used 58 rating statements to describe 61 different programmes. Factor analysis reduced the 58 statements to nine factors (main dimensions of thought), which allow viewers to rate any programme.

Usually the goal of factor analysis is to aid data interpretation: the analyst hopes to identify each factor with a specific theoretical construct. One of the subtlest tasks in factor analysis is determining the appropriate number of factors. Factor analysis also has an infinite number of solutions, because any rotation of a set of factors reproduces the correlations among the variables equally well. Two researchers can therefore find two different sets of factors that are interpreted quite differently yet fit the original data equally well.

Factor analysis requires that the underlying data follow a multivariate normal distribution (so the variables must be measured on at least an interval scale) and that the relationships between variables are linear.
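The statement that factor analysis has an infinite number of solutions can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the loading matrix is hypothetical (4 variables, 2 factors) and the 30-degree rotation angle is arbitrary. It shows that rotating the factors changes the loadings but leaves the reproduced correlations unchanged.

```python
import numpy as np

# Hypothetical loading matrix (assumption for illustration only):
# 4 observed variables (rows), 2 factors (columns).
loadings = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
])


def common_part(L):
    """Part of the correlation matrix reproduced by the factors: L @ L.T."""
    return L @ L.T


# An orthogonal rotation of the factor axes; the angle is arbitrary.
theta = np.deg2rad(30.0)
rotation = np.array([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta),  np.cos(theta)],
])
rotated = loadings @ rotation

# The two solutions have different loadings but fit the data identically.
same_fit = np.allclose(common_part(loadings), common_part(rotated))
different_loadings = not np.allclose(loadings, rotated)
print(same_fit, different_loadings)
```

Because every rotation angle gives another equally good solution, the choice between them has to be made on grounds of interpretability rather than fit.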

The characteristic that distinguishes the two factor analytic models is that principal components analysis (PCA) assumes that all variability in an item should be used in the analysis, while principal factors analysis (FA) uses only the variability that an item has in common with the other items. PCA is preferred as a method for data reduction, while FA is preferred when the goal of the analysis is to detect structure. Both methods begin by computing the correlation matrix of the original variables, and in most cases they yield similar results.
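The difference between the two models can be sketched in a few lines of NumPy. This is a minimal illustration, not a full implementation. The data are synthetic, the helper name top_loadings is hypothetical, and the use of squared multiple correlations as initial communality estimates is one common choice that the text above does not prescribe. PCA decomposes the correlation matrix with ones on the diagonal (all variability), while the principal factors approach replaces the diagonal with communality estimates (shared variability only).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): 500 observations of 6 items
# generated from 2 hidden factors plus independent noise.
n, p, k = 500, 6, 2
true_loadings = np.array([
    [0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
    [0.0, 0.8], [0.0, 0.7], [0.0, 0.6],
])
factors = rng.standard_normal((n, k))
unique_sd = np.sqrt(1.0 - (true_loadings ** 2).sum(axis=1))
X = factors @ true_loadings.T + rng.standard_normal((n, p)) * unique_sd

# Both methods start from the correlation matrix of the original variables.
R = np.corrcoef(X, rowvar=False)


def top_loadings(M, k):
    """Loadings from the k largest eigenvalues of a symmetric matrix M."""
    vals, vecs = np.linalg.eigh(M)           # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]       # indices of the k largest
    vals = np.clip(vals[order], 0.0, None)   # guard against small negatives
    return vecs[:, order] * np.sqrt(vals)


# PCA: analyse R as it is, with ones on the diagonal.
pca_loadings = top_loadings(R, k)

# Principal factors: put communality estimates on the diagonal.
# Here: squared multiple correlation of each item with the others.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)
paf_loadings = top_loadings(R_reduced, k)

# Variance of each item accounted for by the k retained dimensions.
pca_communalities = (pca_loadings ** 2).sum(axis=1)
paf_communalities = (paf_loadings ** 2).sum(axis=1)
print(np.round(pca_communalities, 2))
print(np.round(paf_communalities, 2))
```

In this sketch the principal factors solution accounts for less total variance than PCA, because the unique variance of each item has been removed from the diagonal before the decomposition.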