Bivariate descriptive statistics - categorical variables

Bivariate categorical data can be described using:

a) Contingency Table

b) Contingency Coefficients

c) Cumulative Bar Chart

d) Bar Chart

e) 3-D Bar Chart



Contingency Table

A contingency table (also called a cross-tabulation or crosstab) shows the counts and percentages (row or column) of the joint distribution of two variables that each take only a few distinct values.

Counts

          Primary   GCSE   A levels   University
Male            5     34        176           62
Female         12     49        252           36

Row percentages (%)

          Primary   GCSE   A levels   University
Male         1.81  12.27      63.54        22.38
Female       3.44  14.04      72.21        10.32
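
As a minimal sketch of how the percentage table follows from the counts, the Python snippet below divides each cell by its row total. The variable and function names (counts, levels, row_percentages) are illustrative assumptions and are not part of any particular statistics package.

```python
# Counts copied from the table above: rows are sex, columns are education level.
levels = ["Primary", "GCSE", "A levels", "University"]
counts = {
    "Male": [5, 34, 176, 62],
    "Female": [12, 49, 252, 36],
}


def row_percentages(table):
    """Return each row's cells as percentages of that row's total (2 decimals)."""
    result = {}
    for label, row in table.items():
        total = sum(row)
        result[label] = [round(100 * cell / total, 2) for cell in row]
    return result


percentages = row_percentages(counts)
for label, row in percentages.items():
    # Pair each percentage with its column name for readable output.
    print(label, dict(zip(levels, row)))
```

Column percentages would be obtained the same way, dividing by column totals instead of row totals.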


Contingency Coefficients

Contingency coefficients measure the strength of association between two nominal variables. They range from 0 (no relationship) to 1 (perfect relationship), although Pearson's coefficient stays below 1 for any finite table. The most commonly used coefficients are Pearson's Contingency Coefficient and Cramér's V (1946).

Lambda is another measure of association for cross-tabulations of nominal-level variables. It measures the proportional reduction in prediction error for the dependent variable (row or column) when the value of the other variable is known. For example, a lambda of 0.5 with columns dependent means that using the row variable to predict the column variable reduces prediction errors by 50%. Symmetric Lambda is a weighted average of Lambda with rows dependent and Lambda with columns dependent.
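
The following Python sketch shows one way these coefficients could be computed from a table of counts, using the standard textbook definitions: Pearson's C = sqrt(chi2 / (chi2 + n)), Cramér's V = sqrt(chi2 / (n (k - 1))) with k the smaller of the number of rows and columns, and Goodman and Kruskal's lambda. All function names are illustrative assumptions, the "toy" and "perfect" tables are hypothetical data, and the code assumes that no row or column is empty and that the dependent variable takes more than one value.

```python
import math

# Counts from the contingency table above (rows: Male, Female).
counts = [
    [5, 34, 176, 62],
    [12, 49, 252, 36],
]
# Hypothetical tables, used only to illustrate the coefficients.
toy = [[30, 10], [10, 30]]
perfect = [[10, 0], [0, 10]]


def chi_square(table):
    """Pearson chi-square statistic for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def pearson_c(table):
    chi2 = chi_square(table)
    n = sum(map(sum, table))
    return math.sqrt(chi2 / (chi2 + n))


def cramers_v(table):
    chi2 = chi_square(table)
    n = sum(map(sum, table))
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))


def prediction_errors(table):
    """Errors when guessing the column: (ignoring the row, knowing the row)."""
    n = sum(map(sum, table))
    col_totals = [sum(col) for col in zip(*table)]
    without_rows = n - max(col_totals)             # always guess the modal column
    with_rows = n - sum(max(row) for row in table)  # guess the modal column per row
    return without_rows, with_rows


def lambda_columns_dependent(table):
    without_rows, with_rows = prediction_errors(table)
    return (without_rows - with_rows) / without_rows


def lambda_rows_dependent(table):
    # Transposing the table swaps the roles of the two variables.
    return lambda_columns_dependent([list(col) for col in zip(*table)])


def lambda_symmetric(table):
    """Average of the two lambdas, weighted by their baseline error counts."""
    transposed = [list(col) for col in zip(*table)]
    e_col, _ = prediction_errors(table)
    e_row, _ = prediction_errors(transposed)
    weighted = e_col * lambda_columns_dependent(table) + e_row * lambda_rows_dependent(table)
    return weighted / (e_col + e_row)


print("Pearson C:", pearson_c(counts))
print("Cramer's V:", cramers_v(counts))
print("Lambda (columns dependent):", lambda_columns_dependent(counts))
print("Lambda (symmetric):", lambda_symmetric(counts))
```

For the hypothetical toy table, lambda with columns dependent is 0.5: guessing the modal column gives 40 errors out of 80, while guessing the modal column within each row gives 20. For the hypothetical perfect table, Cramér's V reaches 1 while Pearson's C stays below 1.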




Cumulative Bar Chart

A cumulative bar chart visualizes the percentages from a crosstab.

[Figure: Cumulative Bar Chart]
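
As a rough sketch of what such a chart draws, the Python snippet below computes where each segment of a bar starts and ends. It assumes that "cumulative" means the segments for one group are stacked end to end, so each begins where the previous one finishes; the function name stacked_segments is an illustrative assumption, and the percentages are the row percentages from the contingency table above.

```python
from itertools import accumulate

levels = ["Primary", "GCSE", "A levels", "University"]
# Row percentages from the contingency table above.
row_percentages = {
    "Male": [1.81, 12.27, 63.54, 22.38],
    "Female": [3.44, 14.04, 72.21, 10.32],
}


def stacked_segments(values):
    """Return (start, end) bounds of each stacked segment along one bar."""
    ends = list(accumulate(values))   # running totals
    starts = [0.0] + ends[:-1]        # each segment starts at the previous end
    return list(zip(starts, ends))


segments = {label: stacked_segments(row) for label, row in row_percentages.items()}
for label, bounds in segments.items():
    for level, (start, end) in zip(levels, bounds):
        print(f"{label:6} {level:10} {start:6.2f} to {end:6.2f}")
```

Each bar ends at approximately 100, apart from rounding in the published percentages.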




Bar Chart

A bar chart visualizes the counts from a crosstab.

[Figure: Bar Chart]




3-D Bar Chart

A 3-D bar chart is an alternative to the bar chart.

[Figure: 3-D Bar Chart]