Maoyong Fan wrote:
> I ran into a strange problem in dealing with my scanner data. I use
> Stat/Transfer to transfer the original data to dta files. One of my variables
> is UPC, whose storage type is double. I found that when I generate a new
> variable equal to UPC, the new variable is totally different from UPC.
>
> Like this:
> gen UPC1 = UPC
> gen diff = UPC1 - UPC
> sum
> des
>
>     Variable |      Obs       Mean   Std. Dev.        Min        Max
> -------------+--------------------------------------------------------
>          UPC |  1780770   3.76e+10   1.96e+11   1.00e+10   2.72e+12
>         UPC1 |  1780770   3.76e+10   1.96e+11   1.00e+10   2.72e+12
>         diff |  1780770   144.9445   4290.722    -105114     116086
>
The default storage type for new variables in Stata is float. Float variables
have only about 7 digits of accuracy; if the values in UPC have more than 7
digits, they have to be rounded to fit into a float variable.
To get an exact copy of your variable, you need to generate UPC1 with double
precision:
. gen double UPC1 = UPC
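
If you want to see the rounding on a small scale, here is a minimal sketch
you could paste into a do-file. It assumes the default storage type has not
been changed with -set type double-; the single value below is made up to
stand in for a UPC, and UPC2 is just a name I picked for the exact copy.

```stata
* Toy data: one made-up 11-digit value standing in for a UPC
clear
set obs 1
generate double UPC = 10000000144

* No storage type given, so Stata uses the default (float) and rounds
generate UPC1 = UPC

* Explicit double: the value is copied exactly
generate double UPC2 = UPC

* Show all digits instead of scientific notation
format %15.0f UPC UPC1 UPC2
list UPC UPC1 UPC2

* Passes only if the double copy is exact
assert UPC2 == UPC

* float() rounds its argument to float precision, so it should
* reproduce whatever was stored in UPC1
assert UPC1 == float(UPC)
```

In the listing, UPC1 should differ from UPC in its last digits, while UPC2
should match exactly.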
Note, however, that the difference between 1,000,000,000 and 1,000,000,144 is
probably not very important. Equally, one is seldom interested in the
difference between numbers like 0.1234567144 and 0.1234567000.
More on that can be found under the heading "Precision of numeric storage
types" in help datatypes and in [U] 16.10.
Hope this helps
Uli
--
+49 (030) 25491-361