Hi,
I cannot use -std()- with egen because it is not by-able, and I need the standardization by nic2. When I standardize overall, i.e., over the entire sample of firms, I do use egen with std(). There, once again, one measure of TFP (the OLS one) shows a correlation near 1, but this measure of TFP does not. In any case, once I standardize, the mean of the TFP measure should be zero rather than .54. Also, as I mentioned in my earlier mail, when I standardize by nic2 the OLS measure of TFP gives a correlation of 1.000 with its non-standardized values, which is very close to the correlation between its non-standardized and overall-standardized values; that is not the case for this measure.
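For reference, here is a minimal sketch of how a by-nic2 standardization could be written without the loop, since egen's mean() and sd() functions can be used under by. The variable names tfp_mean_by, tfp_sd_by and tfp_stand_by are hypothetical, chosen only for illustration. The parentheses around the numerator matter: Stata evaluates x - m/s as x - (m/s), which leaves the result with a nonzero mean. Note also that standardizing within industries is not a single linear transformation of the whole sample, so the correlation with the raw values need not be exactly 1 if industry means or standard deviations differ.

```stata
* Sketch only: standardize logtfplevpet_imp within each nic2 industry.
* tfp_mean_by, tfp_sd_by and tfp_stand_by are hypothetical names.
bysort nic2: egen double tfp_mean_by = mean(logtfplevpet_imp)
bysort nic2: egen double tfp_sd_by   = sd(logtfplevpet_imp)

* Parentheses are required: x - m/s would be parsed as x - (m/s).
gen double tfp_stand_by = (logtfplevpet_imp - tfp_mean_by) / tfp_sd_by

* Checks: within each industry the mean should be near 0 and the sd near 1.
tabstat tfp_stand_by, by(nic2) statistics(mean sd n)
correlate logtfplevpet_imp tfp_stand_by
```

If the tabstat output shows means away from 0 within industries, the standardization formula itself is the first thing to check.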
J
--- On Mon, 8/24/09, JIBONAYAN RAYCHAUDHURI <[email protected]> wrote:
> From: JIBONAYAN RAYCHAUDHURI <[email protected]>
> Subject: Standardizing values
> To: [email protected]
> Date: Monday, August 24, 2009, 11:39 AM
> Hi Stata Users,
>
> I am trying to calculate total factor productivity (TFP)
> for a panel of firms. I have these firms classified by
> industry. I have a measure of TFP (tfplevpet_imp, which
> contains imputed values) that I am trying to standardize at
> the 2-digit industrial classification level (nic2, which ranges
> from 15 to 35 with some gaps). I am using the following
> code to do so:
>
> #delimit;
> gen logtfplevpet_mean_imp=.;
> gen logtfplevpet_sd_imp=.;
> #delimit;
> gsort +nic2 +newyear;
> #delimit;
> foreach num of numlist 15 17 19 21 23 24 25 26 27 29 30
> 31 32 35 {;
> by nic2 : egen logtfplevpet_mean_imp`num' =
> mean(logtfplevpet_imp) if nic2==`num';
> by nic2 : egen logtfplevpet_sd_imp`num' =
> sd(logtfplevpet_imp) if nic2==`num';
> };
> #delimit;
> foreach num of numlist 15 17 19 21 23 24 25 26 27 29 30
> 31 32 35 {;
> replace logtfplevpet_mean_imp=logtfplevpet_mean_imp`num' if
> nic2==`num';
> replace logtfplevpet_sd_imp=logtfplevpet_sd_imp`num'
> if nic2==`num';
> };
>
> #delimit;
> gsort +compname +newyear;
> #delimit;
> gen logtfplevpet_stand_imp=
> logtfplevpet_imp-logtfplevpet_mean_imp/logtfplevpet_sd_imp;
>
> The correlation between the standardized and the
> non-standardized values is very low, about 0.25. Also, the
> mean of this measure is .54. This measure of TFP uses a
> semi-parametric estimation technique. For another measure of
> TFP, which I get as the OLS residual from a simple regression,
> the exact same code gives a correlation between standardized
> and non-standardized values of 0.99! Also, if instead of
> standardizing at 2-digit nic I standardize over the
> entire sample, i.e., using the overall mean and variance
> of all firms in all industries,
> the TFP measure shows a mean of nearly 0 and an sd of 1, but
> the correlation with the non-standardized measure is still low, 0.71.
> As a side note, when I impute values I use only those
> nic observations that are used in the standardization. I am
> very puzzled as to why the correlation between standardized
> and non-standardized values is so low, when it should always be
> close to 1. Any comments or suggestions will
> be highly appreciated.
>
> Jibonayan