st: AW: AW: AW: AW: Error in egen rank(), unique?
I see the problem, or at least a counter-intuitive result, as follows:
-egen rank(), unique- gives the ranks
1 2 3 3
for the sequence
2007m8 2008m5 . 2009m3
In my opinion this is counter-intuitive because
1) I understand -unique- to mean that no two ranks are identical, yet identical ranks appear as soon as a missing value is in the sequence
2) I would expect a rank command to sort the values of the variable on its own, without needing a prior -sort- to help it
Best regards,
Marc
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Martin Weiss
Sent: Tuesday, 3 November 2009 15:47
To: [email protected]
Subject: st: AW: AW: AW: Error in egen rank(), unique?
Honestly, I cannot see the point of the problem, but note that your
*************
sort id dbep
by id, sort: egen rank_dbep = rank(dbep), unique
*************
can be telescoped into
-bys id (dbep): egen rank_dbep = rank(dbep), unique-
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Kaulisch, Marc
Sent: Tuesday, 3 November 2009 15:46
To: [email protected]
Subject: st: AW: AW: Error in egen rank(), unique?
Dear Martin,
Thanks for your answer. Your example works just fine, as it does with a different dataset. Deleting the subscript [_n] from the qualifier does not change the behaviour.
I found a workaround:
The problem was that the variable dbep (formatted as %tm) was not sorted.
dbep1    dbep2    dbep3   dbep4    dbep5
2007m8   2008m5   .       2009m3   .
This sequence was ranked:
1 2 3 3
After inserting
. sort id dbep
before the
. by id, sort: egen rank_dbep = rank(dbep), unique
it works fine.
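A minimal sketch of why the prior sort could matter. It rests on an assumption that is not confirmed in the thread: that the duplicate comes from the later -replace ... = _n- step rather than from -egen- itself, because -egen rank()- leaves the rank missing where dbep is missing and _n then counts every observation within id. The toy data, the explicit ordering by episode, and the names rank_a and rank_b are illustrative only. Run it as a do-file, since it uses // comments.

```stata
clear
input id episode dbep
1 1 571
1 2 580
1 3 .
1 4 590
end
format dbep %tm   // 571 = 2007m8, 580 = 2008m5, 590 = 2009m3

* Rank the nonmissing values; the missing dbep is left with a missing rank
bysort id: egen rank_dbep = rank(dbep), unique
assert missing(rank_dbep) if missing(dbep)

* Variant A (assumed failure case): fill the gap with _n while the
* missing value still sits in the middle of the group
gen rank_a = rank_dbep
bysort id (episode): replace rank_a = _n if missing(rank_a)
duplicates report id rank_a   // episode 3 and 2009m3 both hold rank 3

* Variant B: order by dbep first, so missing values come last within id
gen rank_b = rank_dbep
bysort id (dbep): replace rank_b = _n if missing(rank_b)
isid id rank_b                // passes: rank_b is unique within id
```

In Variant B the missing value is the fourth observation within id, so it receives 4 and cannot collide with a rank already assigned to a nonmissing value.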
Nonetheless, I consider this behaviour inconsistent with the description of -egen rank(), unique- in the help file, which says: "The unique option calculates the unique rank of exp: values are ranked 1,...,#, and values and ties are broken arbitrarily. Two values that are tied for second are ranked 2 and 3."
(. h egen)
Best regards,
Marc
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Martin Weiss
Sent: Tuesday, 3 November 2009 13:18
To: [email protected]
Subject: st: AW: Error in egen rank(), unique?
No problem occurs in this code, so where is the material difference from yours?
*************
clear*
set obs 10
gen id=_n
gen dbep1=5+int(10*runiform())
gen dbep2=5+int(10*runiform())
gen deep1=5+int(10*runiform())
gen deep2=5+int(10*runiform())
gen durep1=rnormal()
gen durep2=rnormal()
reshape long dbep deep durep, i(id) j(episode)
by id, sort: egen rank_dbep = rank(dbep), unique
by id, sort: egen rank_deep = rank(deep), unique
by id, sort: replace rank_dbep =_n if rank_dbep[_n] == .
sort id rank_dbep
drop episode
reshape wide dbep deep durep rank_deep, i(id) j(rank_dbep)
*************
You do change the values returned by -egen, rank()- with your -replace- line, so it is hard to argue that -egen- is at fault. Still, the line clearly intends to replace the rank by its running number if it is missing.
So insert something like -inspect rank_dbep- before that line to see whether there are any missings in the first place.
Also note that the -if rank_dbep[_n] == .- qualifier could easily be - if rank_dbep == .-...
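A minimal sketch of the check suggested above, on made-up data (three ids, with the second episode missing for each; none of these values come from the original dataset). It counts the missing ranks before any -replace- and confirms that the [_n] subscript makes no difference to the qualifier. Run it as a do-file, since it uses // comments.

```stata
clear
set obs 3
gen id = _n
gen dbep1 = 5
gen dbep2 = .      // hypothetical: one missing episode per id
gen dbep3 = 7
reshape long dbep, i(id) j(episode)

by id, sort: egen rank_dbep = rank(dbep), unique

* Are there any missing ranks in the first place?
inspect rank_dbep
count if missing(rank_dbep)
assert r(N) == 3   // one missing rank per id in this toy data

* The explicit subscript [_n] refers to the current observation anyway
assert (rank_dbep[_n] == .) == (rank_dbep == .)
```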
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Kaulisch, Marc
Sent: Tuesday, 3 November 2009 11:27
To: [email protected]
Subject: st: Error in egen rank(), unique?
I have a problem with an -egen rank(), unique- command (Stata version 10.1).
It does not seem to produce the unique values I would expect.
This is my attempt to rank episodes by their beginning (dbep). First I reshape the dataset into long format.
. reshape long dbep deep durep, i(id) j(episode)
. by id, sort: egen rank_dbep = rank(dbep), unique
. by id, sort: egen rank_deep = rank(deep), unique
. by id, sort: replace rank_dbep = _n if rank_dbep[_n] == .
. sort id rank_dbep
When I then try to reshape the dataset back into wide format:
. drop episode
. reshape wide dbep deep durep rank_deep , i(id) j(rank_dbep)
I receive the following error:
rank_dbep not unique within id;
there are multiple observations at the same rank_dbep within id.
Type "reshape error" for a listing of the problem observations.
r(9);
-reshape error- reports 15 cases in which rank_dbep is not unique.
Strangely enough, the same commands work just fine with a different dataset.
Any ideas?
Marc Kaulisch
iFQ
Institut für Forschungsinformation und Qualitätssicherung
Godesberger Allee 90
53175 Bonn, Germany
Tel.: +49 - 228 - 9 72 73 - 25
Fax: +49 - 228 - 9 72 73 - 49
E-mail: [email protected]
www.forschungsinfo.de
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/