The way to solve these problems is to avoid them.
Your problem is to check something, not to
calculate something.
The most direct way to check whether the values of a variable
-some- are constant within groups of another variable -block- is
bysort block (some) : assert some[1] == some[_N]
Sorting on -some- within -block- puts the smallest value of each
group first and the largest last, so any variation within a group
shows up as a difference between the first and last values.
Conversely, if all values in a group are the same, the first and
last are equal and the assertion is satisfied.
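For example, with the variable names in your posting that would read

bysort id (var) : assert var[1] == var[_N]

which checks whether -var- is constant within each -id- as a whole.
If you would rather have a 0/1 marker for each -id- than an error
when the check fails, -generate- can stand in for -assert- (the new
variable name here is just illustrative):

bysort id (var) : gen byte allsame = var[1] == var[_N]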
Nick
[email protected]
Newbie wrote:
> I have the following data:
>
> Id date var
> A 1.1.90 10.1
> A 1.2.90 11.2
> A 1.3.90 12.3
> ...
> A 1.11.04 3.1
> A 1.12.04 4.2
> A 1.1.05 4.2
> A 1.2.05 4.2
> A 1.3.05 4.2
> ... (only -date- changes, with -var- fixed at 4.2)
> B 1.1.92 100
> B 1.2.92 110
> B 1.3.92 120
> ...
> B 1.11.03 30.1
> B 1.12.03 40.5
> B 1.1.04 40.5
> B 1.2.04 40.5
> ... (only -date- changes, with -var- fixed at 40.5)
>
> When -var- becomes fixed, it means that the -id- stopped being
> updated. Given that I have thousands of -id-, checking them one
> by one is cumbersome. One way of determining this is to
> calculate, for each observation within each -id-, the average
> of the remaining values and check whether this average equals
> the value in -var-.
> I did the following:
>
> . gsort id -date
>
> . by id: gen n=_n
>
> . by id: gen sum=sum(var)
>
> . by id: gen avg=sum/n
>
> . sort id date
>
> . by id: gen ddate=1 if avg==var
>
>
> Given that -ddate- came back with all values missing, I took the
> difference between -avg- and -var-:
>
> . drop ddate
>
> . gen diff=avg-var
>
> When checking the results in -diff- I realized that it yielded
> values close to 0 but not exactly 0 (something like 8.179e-07).
> Even for the last value, where -avg- is actually equal to -var-,
> the result was something along the lines of 8.179e-07 (for
> instance: var=111.2499, sum=111.2499, avg=111.2499, n=1, and
> diff=8.179e-07). I understand that 8.179e-07 is close to 0, and
> I could do something like
>
> . replace diff=0 if abs(diff)<0.00001
>
> but I'm afraid I could lose some observations. Any ideas about
> why this happens and how to solve it? The values for -var- are
> truncated to 4 decimal places by the database download.
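
On the 8.179e-07: -generate- produces float variables unless you ask
for doubles, and a float carries only about 7 significant digits, so
the stored sums and averages are rounded copies and an exact ==
comparison with -var- can fail even when the numbers agree to all
the digits you care about. If you did want to keep your own
calculation, one sketch (using the variable names from your post) is
to force double precision and compare at float precision:

. gsort id -date
. by id : gen double sum = sum(var)
. by id : gen double avg = sum/_n
. gen byte ddate = float(avg) == var

But the one-line check above makes that calculation unnecessary.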
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/