Title: Multivariate Data Exploration with Stata
1. Multivariate Data Exploration with Stata
Stephen Soldz, Boston Graduate School of Psychoanalysis
ssoldz_at_bgsp.edu
2. Principal Components Analysis
Purpose: Data exploration and data reduction
- Available in Stata
  - Base ado (pca)
  - Built-in (factor, pcf)
  - score will produce component scores
- Issues/Limitations
  - pca is just a wrapper for the (now undocumented) pc option to factor, which the user cannot access or modify
  - Confusing documentation on the difference between pca and factor, pcf (i.e., scaling of eigenvectors)
  - Does not directly allow pca of a correlation/covariance matrix; one must use corr2data, introducing error
  - Does not allow rotate; if this is meant to protect the user, it seems patronizing and uncharacteristic of Stata
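As a rough illustration of what a PCA computed directly from a correlation matrix involves, the Python/NumPy sketch below decomposes the matrix itself and returns two common eigenvector scalings: unit-length eigenvectors, and loadings (eigenvectors multiplied by the square root of their eigenvalue). The function name pca_from_corr and the 3-variable matrix R are hypothetical, made up for illustration; this is not Stata's implementation.

```python
import numpy as np

def pca_from_corr(R):
    """Principal components of a correlation (or covariance) matrix R."""
    R = np.asarray(R, dtype=float)
    # eigh is meant for symmetric matrices; it returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scaling 1: unit-length eigenvectors (each column has norm 1)
    unit_vectors = eigvecs
    # Scaling 2: loadings, i.e. eigenvectors times sqrt(eigenvalue)
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, unit_vectors, loadings

# Hypothetical correlation matrix, for illustration only
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
eigvals, unit_vectors, loadings = pca_from_corr(R)
```

The two scalings carry the same information: the squared loadings in each column sum to that component's eigenvalue, while the unit-length vectors sum to 1. The sketch works on the matrix itself, with no intermediate dataset.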
3. Exploratory Factor Analysis
Purpose: Data exploration and data reduction
- Available in Stata
  - Built-in factor allows principal factors (with and without iteration of communalities) and maximum likelihood
  - Built-in rotate allows varimax (with and without Horst correction) and promax
- Issues/Limitations
  - factor, pfi (principal factors with iteration) does not allow specification of the number of times to iterate; this directly conflicts with the Gorsuch (1983) recommendation that communalities be iterated only 3-4 times
  - As factor is built-in, users cannot modify or build on it
  - rotate options are very limited (only varimax and promax) and users cannot modify them, though they could access the eigenvectors (matrix get) and write their own
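To make the iteration issue concrete, here is a minimal Python/NumPy sketch of principal factors with an explicit cap on the number of communality iterations, in the spirit of the 3-4 passes attributed to Gorsuch (1983) above. The function name principal_factors, the max_iter parameter, the use of squared multiple correlations as starting communalities, and the example matrix are all assumptions made for illustration; this is not a description of how Stata's factor, pfi works internally.

```python
import numpy as np

def principal_factors(R, n_factors, max_iter=3):
    """Iterated principal factors with a fixed number of communality passes."""
    if max_iter < 1:
        raise ValueError("max_iter must be at least 1")
    R = np.asarray(R, dtype=float)
    # Starting communalities (assumed): squared multiple correlations
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)            # communalities replace the unit diagonal
        eigvals, eigvecs = np.linalg.eigh(reduced)
        top = np.argsort(eigvals)[::-1][:n_factors]
        lam = np.clip(eigvals[top], 0.0, None)   # guard against negative eigenvalues
        loadings = eigvecs[:, top] * np.sqrt(lam)
        h2 = (loadings ** 2).sum(axis=1)         # updated communalities
    return loadings, h2

# Hypothetical correlation matrix, for illustration only
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
loadings, h2 = principal_factors(R, n_factors=1, max_iter=3)
```

Exposing max_iter as an argument is the whole point of the sketch: the user, not the routine, decides when to stop iterating.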
4. Exploratory Factor Analysis, Continued
- Issues/Limitations
  - rotate is not well documented, so it is not clear whether one could, e.g., rotate canonical correlations as suggested by Cliff & Krus (1976).
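A rotation routine that accepts any loading-like matrix, rather than only the results of a preceding factor run, is what would make experiments such as rotating canonical solutions possible. The Python/NumPy sketch below is one way to write a raw varimax rotation using a common SVD-based iteration. It applies no row normalization and does not implement the Horst correction listed earlier. The function names varimax and varimax_criterion, the tolerance settings, and the example loadings are hypothetical.

```python
import numpy as np

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return (np.asarray(loadings, dtype=float) ** 2).var(axis=0).sum()

def varimax(loadings, max_iter=1000, tol=1e-8):
    """Raw varimax rotation; returns (rotated loadings, rotation matrix T)."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    T = np.eye(k)
    for _ in range(max_iter):
        rotated = L @ T
        # Proportional to the gradient of the raw varimax criterion
        grad = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / p
        # The orthogonal polar factor of L'grad is taken as the next rotation
        u, s, vt = np.linalg.svd(L.T @ grad)
        T_new = u @ vt
        converged = np.linalg.norm(T_new - T) < tol
        T = T_new
        if converged:
            break
    return L @ T, T

# Hypothetical simple-structure loadings, deliberately mixed by a 20-degree rotation
simple = np.array([[0.8, 0.0],
                   [0.7, 0.0],
                   [0.0, 0.6],
                   [0.0, 0.9]])
theta = np.deg2rad(20.0)
Q = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
mixed = simple @ Q
rotated, T = varimax(mixed)
```

Because T is orthogonal, each variable's communality (row sum of squared loadings) is unchanged by the rotation; only the distribution of loadings across factors changes.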
5. Correspondence Analysis
Purpose: Data exploration and reduction of categorical data
- Available in Stata
  - User-written coranal (correspondence analysis)
  - User-written mca (multiple correspondence analysis)
- Issues/Limitations
  - Graphics broken in Stata 8
  - Statalist question as to whether mca is producing correct output
  - Few variations implemented
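For readers who want to check output by hand, simple correspondence analysis of a two-way table can be sketched in a few lines of Python/NumPy: form the standardized residuals from the independence model and take their singular value decomposition. The function name correspondence_analysis and the 3x3 table are hypothetical, and the sketch makes no attempt to reproduce coranal or mca.

```python
import numpy as np

def correspondence_analysis(table):
    """Simple correspondence analysis of a two-way contingency table."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                              # correspondence matrix
    r = P.sum(axis=1)                            # row masses
    c = P.sum(axis=0)                            # column masses
    # Standardized residuals from the independence model
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * sv) / np.sqrt(r)[:, None]      # principal row coordinates
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]   # principal column coordinates
    inertia = sv ** 2                            # principal inertias
    return row_coords, col_coords, inertia

# Hypothetical contingency table, for illustration only
table = np.array([[20, 10, 5],
                  [10, 20, 10],
                  [5, 10, 20]])
row_coords, col_coords, inertia = correspondence_analysis(table)
```

A useful sanity check on any implementation is that the total inertia equals the Pearson chi-square statistic of the table divided by the table total.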
6. Optimal Scaling
Purpose: Data exploration, reduction, and transformation
- Available in Stata
  - None (that I'm aware of)
7. Multidimensional Scaling
Purpose: Data exploration
- Available in Stata
  - None (that I'm aware of)
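To show how little machinery the classical (metric) variant requires, here is a minimal Python/NumPy sketch: double-center the squared distances and keep the leading eigenvectors. The function name classical_mds and the four example points are hypothetical. Nonmetric MDS, which needs an iterative fit, is not covered.

```python
import numpy as np

def classical_mds(D, n_dims=2):
    """Classical (metric) MDS from a matrix of pairwise distances."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argsort(eigvals)[::-1][:n_dims]     # largest eigenvalues first
    lam = np.clip(eigvals[top], 0.0, None)       # guard against negative eigenvalues
    return eigvecs[:, top] * np.sqrt(lam)

# Hypothetical points (corners of a 3 x 4 rectangle) and their distances
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
coords = classical_mds(D, n_dims=2)
```

The recovered coordinates match the original configuration only up to rotation, reflection, and translation, so the natural check is that the distances between recovered points reproduce D.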
8. Conclusion
- Stata is weak in multivariate exploratory data analysis procedures.
- Many existing procedures are inflexible and not extensible, or user-contributed and not currently maintained.
- Stata lags behind SPSS, SAS, S-Plus, and R in this area.