st: RE: Easy Question? Counting cases based on a "target" case
From: "David Radwin" <[email protected]>
To: <[email protected]>
Subject: st: RE: Easy Question? Counting cases based on a "target" case
Date: Wed, 26 Dec 2012 10:24:58 -0800 (PST)
Ben,
I don't think you need to loop over observations. You can loop over the
distinct values of price instead, which is fairly efficient. Something like this:
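levelsof price, local(prices)
foreach p of local prices {
    * indicator: 1 if the price of this car is within 2000 of target price p
    gen near`p' = inrange(price, `=`p'-2000', `=`p'+2000')
}
* row sum of the indicators: the number of distinct prices within 2000
* of each car (its own price included); this equals the number of cars
* only when no two cars share a price
egen countnear = rowtotal(near*)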
The example above uses every distinct price in the data as a target. To
restrict the targets to particular prices, replace the levelsof and foreach
lines with this single line:
foreach p of numlist 1900 2500 4000 6500 10000 {
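If several cars can share a price, a variant that counts observations
directly may be safer, and it avoids creating one indicator variable per
price. What follows is only a sketch under assumptions: the five prices are
the toy values from the original question, countnear and countother are
illustrative variable names, and the test price == `p' assumes prices that
compare exactly (integer-valued prices, for instance).

```stata
* Toy data: the five prices from the original question
clear
input price
1900
2500
4000
6500
10000
end

gen countnear = .
levelsof price, local(prices)
foreach p of local prices {
    * number of observations with a price in [p - 2000, p + 2000]
    quietly count if inrange(price, `p' - 2000, `p' + 2000)
    * store that count on every observation whose price equals p
    quietly replace countnear = r(N) if price == `p'
}

* same count, excluding the target observation itself
gen countother = countnear - 1

list price countnear countother
```

On these five prices countnear should come out as 2, 3, 2, 1, 1 and
countother as 1, 2, 1, 0, 0 (6500 has no neighbor within 2000, because 4000
is 2500 away). With real data, drop the input block and start from
sysuse auto, clear.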
David
--
David Radwin
Senior Research Associate
MPR Associates, Inc.
2150 Shattuck Ave., Suite 800
Berkeley, CA 94704
Phone: 510-849-4942
Fax: 510-849-0794
www.mprinc.com
> -----Original Message-----
> From: [email protected] [mailto:owner-
> [email protected]] On Behalf Of Ben Hoen
> Sent: Wednesday, December 26, 2012 10:06 AM
> To: [email protected]
> Subject: st: Easy Question? Counting cases based on a "target" case
>
> I want to perform a function that I think would be easy, but I can't wrap my
> head around how to perform it without looping through each case.
>
> I want to create a count of the number of records in the file that meet a
> certain criterion based on the respective case's value. So, for example,
> using the auto dataset:
>
> *====================begin
> sysuse auto, clear
> g id=_n
> * count the number of other cases in the dataset whose price is within
> * $2000 of this case's (i.e., the target car's) price
> egen nearprice2000=count(id) if...
>
> *====================end
>
> The egen command is how I thought I would resolve this, but I can't figure
> it out exactly. For each case, nearprice2000 would equal the number of other
> cases in the dataset whose price is within +/- $2000 of that case's price.
> So if the full dataset had only 5 prices (1900, 2500, 4000, 6500, and
> 10000), their respective nearprice2000 values would be 2, 3, 2, 1, and 1
> (if the case itself is included in the count) or 1, 2, 1, 0, and 0 (if it
> is NOT included in the count).
>
> I might be able to do this by looping through the cases, but I know that is
> not encouraged by other, more experienced users.
>
> Any advice would be greatly appreciated.
>
> Ben