Adrian,

I think it would be a complete travesty to just feed that whole dataset into a factor analysis. Sure, it'll lump together variables with high correlations, but most of the time this doesn't reflect what's going on underneath the data (e.g., a web of direct and indirect causal relations that generated the observed associations/covariance matrix), and this type of situation is what tends to give factor analysis a "bad name" among statisticians. Factor analysis is typically only appropriate for reflective psychometric measures written specifically to assess an underlying trait (e.g., self-esteem, anxiety), not datasets like yours.

I think there are probably complex causal relations among your variables that you should think hard about (using your theoretical knowledge about these variables) and maybe come up with a path-analytic model or growth curve model (say, GDP trajectory and its predictors). You could also compare models across countries.

My two cents,
Cam
----------------------------------------
> From: [email protected]
> To: [email protected]
> Subject: RE: st: RE: Aren't distinct factors from factor analysis or PCA orthogonal to each other?
> Date: Mon, 17 Aug 2009 17:15:33 -0400
>
> Thank you to Cameron, Bob and everybody else for the references.
>
> I have a response to Jay and a couple more questions for everybody, if you can still help me...
>
> Jay wrote:
>> Before you go any further I think you have a big problem to consider: 100 variables on, say 200 countries means you have WAY more covariances (or correlations) than you have countries. This means your correlation matrix is singular.
>
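
[Note on Jay's point: a rough sketch of why too few observations makes the matrix singular. A p x p covariance (or correlation) matrix computed from n observations has rank at most min(p, n - 1), so it is singular whenever there are fewer observations than variables. The data below are random numbers, and the sizes (5 observations, 8 variables) and the helper names covariance_matrix and matrix_rank are assumptions made up for illustration; nothing here is Stata output.]

```python
import random

def covariance_matrix(rows):
    """Sample covariance matrix (n - 1 denominator) of a list of observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def matrix_rank(matrix, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank

random.seed(0)
n_obs, n_vars = 5, 8  # hypothetical sizes: fewer observations than variables
rows = [[random.gauss(0, 1) for _ in range(n_vars)] for _ in range(n_obs)]

cov = covariance_matrix(rows)
rank = matrix_rank(cov)
print("matrix size:", n_vars, "x", n_vars)
print("rank:", rank, "(cannot exceed n_obs - 1 =", n_obs - 1, ")")
```

[In a stacked country-year layout the row count is the number of country-years rather than the number of countries, so the simple count may not bind, although rows from the same country are not independent observations.]
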
>
> I don't think I have that problem because I don't have 200 countries. I only have about 30+ countries.
>
> However, even if I had 200 countries, I don't understand exactly what the problem would be because I have all 100 variables for country i and all 100 variables for country j stacked on one another. So, I have:
>
> country year GDP inflation reserves
> Argentina 1990 2.3 6.4 100
> Argentina 1991 2.8 7.4 250
> Argentina 1992 2.6 7.0 200
> ...
> Argentina 2006 3.2 8.0 400
> Brazil 1990 1.7 5.4 120
> Brazil 1991 2.1 6.3 140
> Brazil 1992 2.5 7.0 180
> ...
>
>
> So the variables I enter into my factor analysis are GDP, inflation, and reserves... and so the -factor- command in Stata knows nothing about the panel/time-series structure of my data. I can see why it should be relevant to account for the underlying panel structure of the data -- for instance, that jump in GDP/inflation/reserves and any other variables between Argentina in 2006 and Brazil in 1990 may be a bit strange to account for.
>
> So, the first question is: do I need to take this panel structure into account? And if so, how?
>
> The other question is, do units matter? For instance, I know that factor analysis or PCA are all based on a variance-covariance matrix... but if I have two variables, x and y, and I take the covariance between the two of them, that'll be different than if I take the covariance of, say 2x and y:
>
> cov(x,y) <> cov(2x,y)
>
> and so what would happen if I express my GDP in dollars for all countries or in local-currency units?? Or in millions or in billions???
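
[Note on the units question: a minimal sketch, not a definitive answer. Multiplying a variable by a constant multiplies its covariances by that constant, so cov(2x,y) = 2*cov(x,y), but it leaves correlations unchanged. The x and y values are the GDP and inflation figures from the example table above, which are themselves illustrative; the helper names covariance and correlation are made up here.]

```python
from statistics import mean

def covariance(a, b):
    """Sample covariance with the usual n - 1 denominator."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)

def correlation(a, b):
    """Pearson correlation built from the covariance helper."""
    return covariance(a, b) / (covariance(a, a) * covariance(b, b)) ** 0.5

# Illustrative GDP and inflation values copied from the example table.
x = [2.3, 2.8, 2.6, 3.2, 1.7, 2.1, 2.5]
y = [6.4, 7.4, 7.0, 8.0, 5.4, 6.3, 7.0]
x_rescaled = [2 * value for value in x]  # same variable in different units

cov_xy = covariance(x, y)
cov_2xy = covariance(x_rescaled, y)
corr_xy = correlation(x, y)
corr_2xy = correlation(x_rescaled, y)

print("cov(x,y), cov(2x,y):  ", cov_xy, cov_2xy)    # second is twice the first
print("corr(x,y), corr(2x,y):", corr_xy, corr_2xy)  # equal up to rounding
```

[So an analysis based on the correlation matrix is unaffected by millions versus billions, while one based on the covariance matrix is not. As far as I know, -factor- works from the correlation matrix and -pca- does so by default, with a covariance option available; please check the manual. Dollars versus local-currency units is a different case: the conversion factor differs by country (and over time), so in the pooled data it is not one constant rescaling and the correlations can change.]
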
>
>
> Thank you once again.
>
> Best,
> Adrian
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/