Thank you, Phil, this works! I had glanced at the -bysort- command beforehand but had not thought to use it this way. I am unfamiliar with the "= _n == _N" syntax, though, and a search just now did not turn up an explanation. What does it mean?
CM
-----Original Message-----
From: Philip Ryan
Sent: Tue 9/30/2003 12:22 AM
To: [email protected]
Subject: Re: st: Short program to "collapse (# unique elements)": Use of nested loops and a "weights not allowed" message
This is a bit simpler and I think does what you want:
bysort citing nclass: gen byte unique = _n == _N
bys citing: replace unique = sum(unique)
by citing: keep if _n == _N
You should test this and maybe tweak it to deal with missing values, if
they exist in your data.
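To unpack the three lines above, here is a minimal sketch run on the example data from the original post. It assumes nclass is numeric, as in the example (the post also calls it a string, and the same commands should work either way). The -input- block and the commented-out -drop- are illustrative additions and are not part of the suggested code. In "gen byte unique = _n == _N", the first "=" assigns and the rest is a logical expression: under -by-, _n is the position of the observation within its group and _N is the group size, so the expression is 1 on the last observation of each citing-nclass group and 0 elsewhere.

```stata
* Rebuild the example data from the original post
clear
input citing cited nclass
100 20 12
100 22 15
100 23 15
101 32 14
101 33 15
101 34 15
101 40 17
end

* Optional, and an assumption about the desired behaviour:
* leave missing classes out of the count
* drop if missing(nclass)

* Tag exactly one observation per distinct citing-nclass pair:
* "_n == _N" evaluates to 1 on the last row of each group, 0 otherwise
bysort citing nclass: gen byte unique = _n == _N

* Running total of the tags within each citing
bysort citing: replace unique = sum(unique)

* The last row of each citing now holds the number of distinct classes
by citing: keep if _n == _N

list citing unique, noobs
```

On these seven rows the listing should show 2 for citing 100 and 3 for citing 101, which matches the output requested in the original post.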
One point in your code: any command of the form "replace varname[_n] = ..."
will fail with the "weights not allowed" error, because Stata reads the
square brackets as introducing weights, and the syntax for -replace- does
not permit weights. More generally, you cannot use explicit subscripting
on a variable on the left-hand side of an assignment ("=").
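For comparison, the running index described in the original post can be built without a loop and without a subscript on the left-hand side. This is a sketch under assumptions, not code from this thread: it reuses the variable names indexpatclass and numpatclass from the original post, and it assumes the example data can be typed in with -input- as shown. Subscripts such as nclass[_n-1] are fine on the right-hand side of the "=".

```stata
* Example data from the original post
clear
input citing cited nclass
100 20 12
100 22 15
100 23 15
101 32 14
101 33 15
101 34 15
101 40 17
end

* Within each citing (sorted by nclass), add 1 whenever the row is the
* first in its group or its nclass differs from the previous row;
* -by- restarts both _n and the running sum for every citing
bysort citing (nclass): gen indexpatclass = sum(_n == 1 | nclass != nclass[_n-1])

list, sepby(citing) noobs

* The step planned in the original post: keep the maximum index per citing
collapse (max) numpatclass = indexpatclass, by(citing)
```

The "_n == 1" term is there so that the first row of every group always starts the index at 1, whether nclass is numeric or string.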
Phil
At 11:41 PM 29/09/2003 -0500, you wrote:
>Hi statalisters,
>
>I have been working on a short program that doesn't seem to work, I think
>I'm just missing a small mistake... I have a data file with three
>columns: citing, cited, nclass. For every "citing", there are multiple
>"cited", and for each "cited" there is a "nclass". The file is sorted by
>citing, then nclass. I need a program to count the number of unique
>"nclass" strings associated to each "citing".
>
>As a simple example, given the following data file "data.dta":
>
>citing cited nclass
>100 20 12
>100 22 15
>100 23 15
>101 32 14
>101 33 15
>101 34 15
>101 40 17
>
>I need the following output file:
>
>citing numpatclass
>100 2 [12 and 15 are unique, 15 is repeated]
>101 3 [14, 15, 17 are unique, 15 is repeated]
>
>I have decided to do it by creating an intermediate file which I will
>later collapse(max):
>
>citing cited nclass indexpatclass
>100 20 12 1
>100 22 15 2
>100 23 15 2
>101 32 14 1
>101 33 15 2
>101 34 15 2
>101 40 17 3
>
>"indexpatclass" indexes by 1 whenever a "citing" involves a new "nclass",
>and resets to 1 whenever a new "citing" begins. So I have created a short
>program. It sorts by "citing" and "nclass", then it uses a while-loop,
>and then two if-loops. But there are two problems: (1) I am getting a
>"weights not allowed" message when I try to run it. (2) I am also not
>sure whether I am properly nesting my loops. Can anybody provide any
>insight? Or alternatively, is there a much simpler way to do what I am
>attempting?
>
>Thanks, --Chihmao.
>
>--------------------------------
>
># delimit cr
>program define uniqpatclass
>use c:\temp\data
>generate indexpatclass=0
>sort citing nclass
>replace indexpatclass=1 in 1
>generate id=_n
>
>while id<_N {
> if citing[_n]==citing[_n-1] {
> if nclass[_n]==nclass[_n-1] {
> replace indexpatclass[_n]=indexpatclass[_n-1]
> id = `id' + 1
> }
> else {
> replace indexpatclass[_n]=indexpatclass[_n-1]+1
> id = `id' + 1
> }
> }
> else {
>replace indexpatclass[_n]=1}
>id = `id' + 1
>}
>end
>
>
>
>*
>* For searches and help try:
>* http://www.stata.com/support/faqs/res/findit.html
>* http://www.stata.com/support/statalist/faq
>* http://www.ats.ucla.edu/stat/stata/
Philip Ryan
Associate Professor,
Department of Public Health
Associate Dean (Information Technology)
Faculty of Health Sciences
University of Adelaide 5005
South Australia
tel 61 8 8303 3570
fax 61 8 8223 4075
http://www.public-health.adelaide.edu.au/
CRICOS Provider Number 00123M