Re: st: Re: bysort problem
Sergiy Radyakin wrote:
Hi Nikolaos!
I guess this code works:
***--------------------------------------------------------------------------------------
clear
set more off
input var1 var2
.145 .14
.145 .15
.145 .15
.167 .15
1.89 .15
1.89 .16
1.89 .16
end
list
tempvar _id
local _idN=2
while `_idN'!=1 {
di "------------------------------"
qui gen `_id'=0
list
bysort var1 var2: replace `_id'=_n
replace var1=var1+(`_id'-1)*0.001 if `_id'>1
sum `_id', detail
local _idN=r(max)
list
drop `_id'
di "=============================="
}
***--------------------------------------------------------------------------------------
However, I do not understand why you write "The obs in 3 and 6
should have _id (__000002) 0.016 and ) 0.017,
respectively." You change var1, not the temporary id variable. Why do
you expect _id==0.016?
Have you considered the possibility that adding 0.001 might move an
observation into a different group (defined by the pair var1, var2)?
Or is that exactly the desired behaviour? Imagine var2 is constant
for all observations, var1 for obs 1 to 1000 runs from 0.001
to 1, and there is one more observation with var1=0.001. This
procedure will add 0.001 to it 1000 times, moving that observation all
the way to 1.001.
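The thought experiment above can be tried directly. What follows is only a sketch under stated assumptions, not code from the thread: var1_k is a hypothetical stand-in for var1 held in integer thousandths (so that equality tests are exact), and id and shifts are illustrative names.

```stata
* Sketch only: var1_k, id and shifts are hypothetical names, not from the thread.
clear
set obs 1001
gen long var1_k = cond(_n <= 1000, _n, 1)   // 1..1000 (0.001 to 1) plus one extra 1
gen byte var2 = 1                           // constant for all observations
gen long id = .
local shifts = 0
local maxid = 2
while `maxid' > 1 {
    * number the observations within each (var1_k, var2) group
    quietly bysort var1_k var2: replace id = _n
    * move every duplicate up by one thousandth per position in its group
    quietly replace var1_k = var1_k + (id - 1) if id > 1
    quietly summarize id, meanonly
    local maxid = r(max)
    if `maxid' > 1 {
        local ++shifts                      // count passes that moved something
    }
}
display "passes that moved an observation: `shifts'"
quietly summarize var1_k, meanonly
display "largest var1 (in thousandths): " r(max)
```

Each pass resolves one collision and creates the next one, so the number of passes grows with the length of the run of adjacent values.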
Regards,
Sergiy
----- Original Message -----
From: "Nikolaos A. Patsopoulos" <[email protected]>
To: <[email protected]>
Sent: Monday, February 26, 2007 8:50 AM
Subject: st: bysort problem
I'm currently writing a program that at some point checks whether more
than one observation has the same values of two vars (E and SE). If more
than one exists, then SE is increased by 0.001:
tempvar _id
qui gen `_id'=0
local _idN=2
while `_idN'!=1 {
bysort `E' `SE': replace `_id'=_n if `touse'
count if `_id'>1 & `touse'
replace `SE'=`SE'+(`_id'-1)*0.001 if `_id'>1
sum `_id' if `touse', detail
local _idN=r(max)
list `E' `SE' `_id' if `touse'
}
When I run the above piece of code, bysort fails in the second pass:
2
(2 real changes made)
                          __000002
-------------------------------------------------------------
      Percentiles      Smallest
 1%            1              1
 5%            1              1
10%            1              1       Obs                   7
25%            1              1       Sum of Wgt.           7
50%            1                      Mean           1.285714
                        Largest       Std. Dev.        .48795
75%            2              1
90%            2              1       Variance       .2380952
95%            2              2       Skewness       .9486833
99%            2              2       Kurtosis            1.9
+------------------------+
| var1 var2 __000002 |
|------------------------|
1. | .145 .014 1 |
2. | .145 .015 2 |
3. | .145 .015 1 |
4. | .167 .015 1 |
5. | 1.89 .015 1 |
|------------------------|
6. | 1.89 .016 2 |
7. | 1.89 .016 1 |
+------------------------+
0
(0 real changes made)
                          __000002
-------------------------------------------------------------
      Percentiles      Smallest
 1%            1              1
 5%            1              1
10%            1              1       Obs                   7
25%            1              1       Sum of Wgt.           7
50%            1                      Mean                  1
                        Largest       Std. Dev.             0
75%            1              1
90%            1              1       Variance              0
95%            1              1       Skewness              .
99%            1              1       Kurtosis              .
+------------------------+
| var1 var2 __000002 |
|------------------------|
1. | .145 .014 1 |
2. | .145 .015 1 |
3. | .145 .015 1 |
4. | .167 .015 1 |
5. | 1.89 .015 1 |
|------------------------|
6. | 1.89 .016 1 |
7. | 1.89 .016 1 |
+------------------------+
The obs in 3 and 6 should have _id (__000002) 0.016 and ) 0.017,
respectively.
What do I miss?
Another short question:
How can I label tempvars and locals?
Thanks in advance,
Nikos
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
The purpose of the code is to eliminate duplicate observations of the
var1 & var2 combination.
In the first pass the algorithm fixes duplicates, but new ones might
come up, so it should keep running until none are left. The correction is
negligible for the real data (the values here are dummy ones, just to
test the code), and the chance of duplicates is very small but still
present.
var2 is changed, not var1. This was a mistake I made in an earlier e-mail.
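A likely explanation for the second pass quoted above, though nothing in the thread confirms it, is floating-point precision. If var2 is stored as a float, a value reached by adding 0.001 need not be bit-identical to the same number typed in directly, so bysort can put two observations that list identically into different groups. The sketch below avoids the comparison by holding var2 in integer thousandths. It rests on assumptions: se_k and id are hypothetical helper names, and the toy data are only chosen to resemble the listing.

```stata
* Sketch only: se_k and id are hypothetical helpers; the data are made up
* to resemble the listing and are not the poster's real data.
clear
input var1 var2
.145 .014
.145 .014
.145 .015
.167 .015
1.89 .015
1.89 .015
1.89 .016
end
gen long se_k = round(var2 * 1000)     // var2 in units of 0.001, exact integers
gen long id = .
local maxid = 2
while `maxid' > 1 {
    * number the observations within each (var1, se_k) group
    quietly bysort var1 se_k: replace id = _n
    * shift duplicates by 0.001 per position in the group
    quietly replace se_k = se_k + (id - 1) if id > 1
    quietly summarize id, meanonly
    local maxid = r(max)
}
isid var1 se_k                         // errors out if any duplicate pair remains
replace var2 = se_k / 1000             // back to the original units
list var1 var2 se_k
```

Working in integers keeps every equality test exact. If the values have to stay in decimal form, one alternative is to round both the starting values and each updated value with round(x, 0.001), so that equal numbers are always produced by the same computation.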