Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Conditional Variable means to new observation
From
Nickolas Lyell <[email protected]>
To
"[email protected]" <[email protected]>
Subject
RE: st: Conditional Variable means to new observation
Date
Wed, 4 Sep 2013 15:44:42 -0400
Ok.
I am still stuck with this problem. I would like to make a new observation that encompasses all large counties. I already have a dummy for large counties and would just like to sum, within each year, all observations where the large-county dummy equals 1.
I want to do this so I can analyze all large counties together the same way I analyze each individual observation. I don't plan on running any regressions on this dataset; I am merely creating new variables to help me understand the data. I am using Stata because the programmatic approach of a do-file makes sense to me when dealing with these very large files.
Could anyone please help?
Nicholas Lyell
-----Original Message-----
From: Nickolas Lyell
Sent: Friday, August 30, 2013 9:28 AM
To: '[email protected]'
Subject: RE: st: Conditional Variable means to new observation
I see, thank you.
Nicholas Lyell
Research Associate
National Association of Counties | NACo
[email protected] | 202.661.8820
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Maarten Buis
Sent: Friday, August 30, 2013 9:25 AM
To: [email protected]
Subject: Re: st: Conditional Variable means to new observation
On Fri, Aug 30, 2013 at 3:05 PM, Nickolas Lyell wrote:
> I am looking to take a conditional mean (or sum) of a variable and include it as a new observation.
>
> For instance, I have data with several county indicators horizontally and county ids vertically. I would like to take the mean growth rate (a variable) for only those counties that are Large (LgMdSm==2) and create a new observation that contains that value under the variable growth rate.
You almost never want to store those numbers as an extra row in your data. Stata takes the definition of a dataset very strictly, and rightly so: the rows are the units and the columns are characteristics of those units. All large counties together do not represent a new unit. However, the mean growth rate you want to compute is a characteristic shared by all counties that are "large", so that mean has to be stored as a column. Here are two ways of computing such means:
*------------------ begin example ------------------
// create some example data
clear
set obs 10
gen county_id = _n
gen LgMdSm = (_n > 5) + 1
gen growth = rnormal()
// first method
egen mean_growth = mean(growth) if LgMdSm == 2
// second method
bys LgMdSm : egen mean_growth2 = mean(growth)
// see the results
list
*------------------- end example -------------------
* (For more on examples I sent to the Statalist see:
* http://www.maartenbuis.nl/example_faq )
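For the follow-up question in this thread (a sum over large counties within each year), the same idea might be sketched as follows. This is only an illustration under assumptions: the variable names year, large, and value and the simulated data are hypothetical and do not come from the original dataset.

```stata
// hypothetical example data: 4 counties, each observed in 3 years
clear
set seed 12345
set obs 12
gen county_id = ceil(_n / 3)
bys county_id : gen year = 2010 + _n
gen large = county_id > 2        // assumed dummy: 1 = large county
gen value = runiform()           // assumed variable to be summed

// yearly total over large counties, stored as a column;
// observations excluded by the if qualifier get a missing value
bys year : egen large_total = total(value) if large == 1

// alternative: a temporary summary dataset with one row per year
preserve
collapse (sum) large_total2 = value if large == 1, by(year)
list
restore
```

The egen version keeps one row per county-year and adds the yearly total as a characteristic of the large counties. The collapse version produces a separate table of yearly totals, and preserve/restore brings the original data back afterwards.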
Hope this helps,
Maarten
---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany
http://www.maartenbuis.nl
---------------------------------
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
*