Tesfayi Gebre wrote:
> I want to generate a single unique id to merge files
> by using variables: var1 (F1) var2 (F3) var3 (F4) and
> var4 (F2), where F shows the integer width (f format).
> So, I used
>
> gen id = var1*10^9 + var2*10^6 + var3*10^2 + var4
>
> This is the output from stata and the true id (as it
> should be) for the following
> example:
>
> var1  var2  var3  var4  id (stata)   id (true)
>    1     2  3452     1  1005452032  1002345201
>    1   121    34    10  1121033984  1121003410
>    2    23   156     2  2023155968  2023015602
>    2    45     3     2  2045003008  2045000302
>    3     6     4     3  3006003968  3006000403
>    3    70     8     5  3070008064  3070000805
>
> Why is the stata id different from the true id? Does
> stata store numbers in different format?

Generate your id variable as a double. Doubles hold about 16 digits of
accuracy, which is more than enough for a 10-digit id:

gen double id = var1*10^9 + var2*10^6 + var3*10^2 + var4

I presume your default data type is -float-, so the values were rounded
when id was stored. Floats have only about 7.22 digits of accuracy, too
few for a 10-digit id. See http://www.stata.com/support/faqs/data/prec.html
and -help datatype- for more info.
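
To see the effect directly, here is a minimal sketch that re-enters the six
example rows and builds the id both ways. The names id_float and id_double,
and the storage types given to -input-, are my own choices for illustration
and not anything taken from the original data.

```stata
* Sketch only: re-enter the six example rows from the question
clear
input byte var1 int var2 int var3 byte var4
1   2 3452  1
1 121   34 10
2  23  156  2
2  45    3  2
3   6    4  3
3  70    8  5
end

* Same expression, stored once as float and once as double
gen float  id_float  = var1*10^9 + var2*10^6 + var3*10^2 + var4
gen double id_double = var1*10^9 + var2*10^6 + var3*10^2 + var4

* Show all ten digits instead of the default general format
format id_float id_double %10.0f
list, noobs

* isid exits with an error if id_double does not uniquely identify rows
isid id_double
```

With the expression as written, id_float should differ from id_double only in
the last two or three digits. The larger gaps in the quoted table look
consistent with var3 having been multiplied by 10^3 rather than 10^2 in the
command that was actually run, so that may be worth checking as well. A -long-
would not be a safe alternative here, since its maximum is 2,147,483,620 and
ids with var1 of 3 or more exceed it.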
Patrick Joly