Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
RE: st: Conditional Variable means to new observation
From
Nickolas Lyell <[email protected]>
To
"[email protected]" <[email protected]>
Subject
RE: st: Conditional Variable means to new observation
Date
Wed, 11 Sep 2013 12:56:07 -0400
Actually, I'm now trying to do a similar process of summations and means but I would like to use a bysort command with it.
My data looks something like this:
fips_num NAICS sector GDP
1001 112 agriculture 240
1001 114 agriculture 310
...
1002 112 agriculture 105
1002 114 agriculture 112
As such, I have over 3000 counties (id = fips_num) and would like to do what Phil suggested below. Copied here with the relevant fields:
preserve
ds _all, has(type numeric)
collapse (sum) `r(varlist)' if sector=="agriculture"
tempfile totals
save `"`totals'"'
restore
append using `"`totals'"'
but I would like to do it bysort fips_num.
Again, I realize this isn't something you normally want to do to a dataset, but seeing as I'm not planning on doing any kind of regression on this data, having the sums as new observations, rather than variables, would make my life much easier. Optimally, I would then drop all the non-sum variables, so I don't necessarily need the to append the tempfile. I could, rather, just use a command that would create a new file with all the "sector" sums by fips_num.
Nicholas Lyell
Research Associate
National Association of Counties | NACo
[email protected] | 202.661.8820
-----Original Message-----
From: Nickolas Lyell
Sent: Wednesday, September 04, 2013 4:28 PM
To: '[email protected]'
Subject: RE: st: Conditional Variable means to new observation
Thanks Phil
Nicholas Lyell
Research Associate
National Association of Counties | NACo
[email protected] | 202.661.8820
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Phil Schumm
Sent: Wednesday, September 04, 2013 4:26 PM
To: [email protected]
Subject: Re: st: Conditional Variable means to new observation
On Sep 4, 2013, at 2:44 PM, Nickolas Lyell <[email protected]> wrote:
> I am still stuck with this problem. I would like to make a new observation that encompasses Large counties. I already have a dummy for large counties and would just like to sum all of each year's observations where Large County is equal to 1.
>
> I want to do this so I can analyze all large counties the same way I am looking at each observation. I don't plan on doing any regressions on this dataset, merely creating new variables to help me understand the data and I am using Stata because the programmatic approach of a do file makes sense to me when dealing with these very large files.
>
> Could anyone please help.
While Maarten is in general correct (i.e., you should not try to use Stata like you would a spreadsheet program), it sounds to me like what you want is essentially to create a (possibly large) data table with a row of totals at the bottom. If so, Stata can certainly oblige (and if you are not already facile with another programming language -- and even if you are -- it would not be unreasonable to use Stata for this task). Here is probably the simplest approach:
preserve
ds _all, has(type numeric)
collapse (sum) `r(varlist)' if LgMdSm==2
tempfile totals
save `"`totals'"'
restore
append using `"`totals'"'
Note that the call to -ds- is merely to avoid trying to sum any string variables you might have, which will generate an error. In addition, may wish to add to this something like
gen byte totals = (_n==_N)
so that you can easily identify (and exclude) this row later on, if necessary. But as Maarten said, if you are going to go on to analyze these data in Stata, it would be best to delete this additional row(s) first, to avoid it causing problems.
Finally, note that this approach is quite general, as it will accommodate any of the types of summaries that -collapse- will generate. And you may want to use -export excel- to move the resulting data table to Excel for visual inspection, where features such as splits will allow you to hold the total row fixed while scrolling through the rows of data.
-- Phil
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/