Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Increase observations by group and then calculate percentages
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Increase observations by group and then calculate percentages
Date: Thu, 3 Nov 2011 08:01:39 +0000

I've got to say that I think this is a bad idea. It's a spreadsheet
idea imported into a statistical software context. Group
characteristics are best stored as group characteristics. If you do
this, then you have to remember to exclude the observations with
summaries from every analysis thereafter. It seems unlikely that you
would really prefer to do that.
-egen- in conjunction with -by:- or -by()- is a handy tool to generate
group summaries. -egen- also has a -tag()- function for the common need
to look at each group summary just once.
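
As a minimal sketch of that alternative, the percentages can be held in
a new variable rather than in extra observations. The toy data below
copy the values from the example in the question; the names total_S1A1,
pct_S1A1 and area_tag are hypothetical, and only one Sex/Age variable is
shown.

```stata
* Toy data mirroring the layout in the question (values from its example)
clear
input AreaID str8 Ethnicity Sex1Age1 Sex2Age1
1 "GroupX"  18  16
1 "Total"   57  21
2 "GroupX"  33  81
2 "Total"  528 147
end

* Spread each area's Total across all observations for that area.
* total() ignores missings, so only the Total row contributes.
bysort AreaID: egen total_S1A1 = total(cond(Ethnicity == "Total", Sex1Age1, .))

* Percentage stored as a variable (assumed names, not from the original post)
generate pct_S1A1 = 100 * Sex1Age1 / total_S1A1 if Ethnicity == "GroupX"

* tag() marks exactly one observation per area, for listing each summary once
egen area_tag = tag(AreaID)
list AreaID total_S1A1 if area_tag
```

With this layout nothing has to be excluded from later analyses, because
no observation holds a summary in place of data.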
The main trick to do what you want is to use -expand-, as a thread
started a few hours ago by Fernando Luco shows.
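
A rough sketch of the -expand- route, again on toy data copied from the
example in the question. The names is_pct and rowrank are hypothetical,
and the loop assumes exactly one GroupX and one Total observation per
AreaID; for the full dataset the varlist would need to cover all 16
Sex/Age variables.

```stata
* Toy data mirroring the layout in the question (values from its example)
clear
input AreaID str8 Ethnicity Sex1Age1 Sex2Age1
1 "GroupX"  18  16
1 "Total"   57  21
2 "GroupX"  33  81
2 "Total"  528 147
end

* Add one copy of each GroupX observation; is_pct is 1 for the new copies
expand 2 if Ethnicity == "GroupX", generate(is_pct)
replace Ethnicity = "PercentX" if is_pct

* Order rows within area as GroupX, Total, PercentX
generate byte rowrank = cond(Ethnicity == "GroupX", 1, cond(Ethnicity == "Total", 2, 3))
sort AreaID rowrank

* Under by:, [1] is the GroupX row and [2] the Total row of each area
foreach v of varlist Sex1Age1 Sex2Age1 {
    by AreaID: replace `v' = 100 * `v'[1] / `v'[2] if is_pct
}

drop rowrank
list, sepby(AreaID)
```

The PercentX observations then need an -if !is_pct- qualifier in every
later analysis, which is the drawback noted above.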
Nick
On Thu, Nov 3, 2011 at 2:14 AM, Catherine Tisch
<[email protected]> wrote:
> Hi all
>
> I'm hoping someone can help me with my problem - I'd like to insert an
> observation after every group and then create percentages by group in my
> panel dataset.
>
> My dataset looks something like this:
>
> AreaID  Ethnicity  Sex1Age1  Sex2Age1  Sex1Age2  Sex2Age2  and so on....
> 1       GroupX     18        16
> 1       Total      57        21
> 2       GroupX     33        81
> 2       Total      528       147
> and so on....
>
> Ethnicity is stored as a string, all other variables are stored as
> float. I have approximately 2000 AreaIDs and 16 Sex/Age groups. I am
> using Stata 11.1.
>
> I'm hoping my final dataset might look something like this:
>
> AreaID  Ethnicity  Sex1Age1  Sex2Age1  Sex1Age2  Sex2Age2  and so on....
> 1       GroupX     18        16
> 1       Total      57        21
> 1       PercentX   31.6      76.2
> 2       GroupX     33        81
> 2       Total      528       147
> 2       PercentX   6.3       55.1
> and so on.....
>
> Where PercentX is GroupX/Total for each AreaID by the Sex/Age variables.
>
> I think the first stage I need to do is increase the number of
> observations in my dataset by using -set obs-. I tried
>
> set obs `=_N+1', by AreaID
>
> but got the error message 'options not allowed'. How can I get around
> this?
>
> Once I get that sorted, the second stage is calculating percentages. I
> think I need to use the -by- prefix; any suggestions would be most
> welcome.
>