
RE: st: RE: using the 'real' command


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: RE: using the 'real' command
Date   Thu, 9 Dec 2004 13:31:22 -0000

A bit more thought, and I have 
to say that I am probably giving 
some good and some bad advice here. 

-egen, group()- is often a good way 
of generating unique identifiers, but 
its codes depend on which combinations 
of values occur in the dataset in hand. 
Applied separately in different datasets, 
it won't in general lead to identifiers 
that can be used in merging. 
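
A minimal sketch of the problem, using two invented toy 
datasets (the values and the two-variable key are assumptions 
for illustration, not the poster's data): the same household 
ends up with different codes in the two files. The last 
command shows one possible alternative, an identifier built 
from the values themselves, assuming the components are 
whole numbers of at most two digits. 

```stata
* Toy data, first file: three households
clear
input cl01 cl02
1 5
2 7
3 9
end
egen id = group(cl01 cl02)
list    // household (2, 7) is coded id == 2

* Toy data, second file: the first household is absent
clear
input cl01 cl02
2 7
3 9
end
egen id = group(cl01 cl02)
list    // the same household (2, 7) is now coded id == 1

* An id built from the values is the same in either file
generate hhid = string(cl01, "%02.0f") + string(cl02, "%02.0f")
list    // household (2, 7) has hhid == "0207"
```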

Nick 
[email protected] 

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of Nick Cox
> Sent: 08 December 2004 19:46
> To: [email protected]
> Subject: RE: st: RE: using the 'real' command
> 
> 
> So the beginning and end of the problem 
> is the need for a unique identifier. 
> 
> In this situation, I would try 
> 
> egen id = group(cl0?), label 
> 
> as a somewhat lazier way to climb the mountain. 
> 
> It would seem that you need to do something 
> similar in the other dataset. 
> 
> If your ids are simple integers, the numeric
> format before and the string format after 
> don't sound like an issue. If your ids are not 
> simple integers, you are probably going to 
> get major problems by forcing them to be integers
> when they are really numbers with fractional parts. 
> 
> Specifically, I don't like the look of 
> 
> tostring ... , force 
> 
> As -tostring-'s putative parent (parthenogenesis
> is fun), I underline that -force- is an explicit 
> signal that you know you could lose information 
> when you do this. Any use of -force- prior to 
> a -merge- is inviting trouble, as for a -merge- 
> you really do want your identifiers to be correct and not 
> mangled. 
> 
> Yet further: the -tostring- / -real()- / -tostring- 
> sequence looks fairly weird, especially as -real()- 
> itself can happily play havoc with stuff it 
> doesn't understand. 
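> 
> A small sketch of both hazards, assuming an invented 
> variable x (not from the original do-file): -real()- 
> returns missing for any string it cannot read as a 
> number, and -tostring- without -force- declines to 
> convert when the result would not be reversible. 
> 
> ```stata
> display real("07")       // 7
> display real("1,234")    // . (missing: the comma is not understood)
> display real("n/a")      // . (missing)
> 
> clear
> set obs 1
> generate double x = 1/3
> tostring x, replace      // x is left numeric: converting would lose information
> describe x
> ```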
> 
> Nick 
> [email protected] 
> 
> [email protected]
> > 
> > Thank you for your prompt reply. Yes, it appears I neglected to put the
> > delimiter ... it works fine now, but I still have my original problem of
> > forcing the display format to be %2.0f.
> > 
> > Here is the context: I am working with a household survey on child labor
> > force. I am trying to generate a unique id by concatenating a few string
> > variables. The reason I am creating the unique id is so that I can merge
> > the data set with another survey on labor force from the same country.
> > The variables (cl01 thru cl08) originally came in numeric format (with a
> > display format of %2.0f). What I am trying to do is convert these
> > variables to string variables while keeping the display format.
> > 
> > Unfortunately, the display format that I get when I convert to string
> > is %9s.
> > 
> > This is what my do file looks like:
> > 
> > tostring cl01-cl08, replace force;
> > gen id1=real(cl01); format id1 %02.0f;
> > gen id2=real(cl02); format id2 %02.0f;
> > gen id3=real(cl03); format id3 %02.0f;
> > gen id4=real(cl04); format id4 %02.0f;
> > gen id5=real(cl05); format id5 %02.0f;
> > gen id6=real(cl06); format id6 %03.0f;
> > gen id7=real(cl07); format id7 %02.0f;
> > gen id8=real(cl08); format id8 %02.0f;
> > tostring id1-id8, replace usedisplayformat;
> > egen hhid=concat(id1 id2 id3 id4 id5 id6 id7 id8);
> > 
> > any advice is appreciated



