Re: st: creating a new variable
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: creating a new variable
Date: Wed, 18 Jul 2012 12:40:38 +0100
Here are five solutions for a similar problem.
. sysuse auto
. tab rep78, su(mpg)
     Repair |     Summary of Mileage (mpg)
Record 1978 |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          1 |          21   4.2426407           2
          2 |      19.125   3.7583241           8
          3 |   19.433333   4.1413252          30
          4 |   21.666667   4.9348699          18
          5 |   27.363636   8.7323849          11
------------+------------------------------------
      Total |   21.289855   5.8664085          69
. tabstat mpg , by(rep78)
Summary for variables: mpg
     by categories of: rep78 (Repair Record 1978)

   rep78 |      mean
---------+----------
       1 |        21
       2 |    19.125
       3 |  19.43333
       4 |  21.66667
       5 |  27.36364
---------+----------
   Total |  21.28986
--------------------
. graph dot (mean) mpg, over(rep78) vertical
. egen mean_mpg = mean(mpg), by(rep78)
. scatter mean_mpg rep78
. dotplot mpg, over(rep78) bar
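Carried over to the variables in the question, the -egen- approach above might look like the sketch below. The names bw and gestwk are taken from the post; mean_bw and week_tag are made-up names for illustration. With about 3 million observations, it may help to plot just one point per week rather than every observation, which is what the tag variable is for.

```stata
* mean birth weight within each gestational week,
* repeated on every observation in that week
egen mean_bw = mean(bw), by(gestwk)

* flag exactly one observation per week (week_tag is 1 once per week, 0 otherwise)
egen week_tag = tag(gestwk)

* plot one point per gestational week
scatter mean_bw gestwk if week_tag
```

If only a table is wanted, -tabstat bw, by(gestwk)- should give the same means without creating a variable.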
On Wed, Jul 18, 2012 at 11:34 AM, Amal Khanolkar <[email protected]> wrote:
> I have a very simple problem that I'm unable to find a simple solution for:
>
> Below is the data concerned:
>
> Gestational age in weeks:
>
> tab gestwk
>
>      gestwk |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>          22 |        134        0.00        0.00
>          23 |        387        0.01        0.02
>          24 |        738        0.02        0.04
>          25 |      1,235        0.04        0.08
>          26 |      1,688        0.06        0.14
>          27 |      2,125        0.07        0.21
>          28 |      2,723        0.09        0.30
>          29 |      3,415        0.11        0.42
>          30 |      4,481        0.15        0.57
>          31 |      5,876        0.20        0.76
>          32 |      8,533        0.29        1.05
>          33 |     12,958        0.43        1.49
>          34 |     21,420        0.72        2.20
>          35 |     36,710        1.23        3.44
>          36 |     70,297        2.36        5.79
>          37 |    151,310        5.07       10.87
>          38 |    373,660       12.53       23.40
>          39 |    660,536       22.15       45.55
>          40 |    822,376       27.58       73.13
>          41 |    542,442       18.19       91.33
>          42 |    219,603        7.37       98.69
>          43 |     31,928        1.07       99.76
>          44 |      5,470        0.18       99.94
>          45 |      1,648        0.06      100.00
> ------------+-----------------------------------
>       Total |  2,981,693      100.00
>
>
> Mean birth weight of my study sample:
>
> . sum bw
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           bw |   2980093    3502.431    575.7603        300       6780
>
> sum bw if gestwk==26
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           bw |      1610    902.7248    189.5523        350       1970
>
>
> Below is what I do when I want to look at the mean birth weight for a particular gestational week:
>
> . sum bw if gestwk==27
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           bw |      2024    1014.961     201.809        380       1920
>
> . sum bw if gestwk==28
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           bw |      2613    1138.658     238.724        370       2000
>
> . sum bw if gestwk==29
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           bw |      3316    1295.815    278.1803        370       2480
>
>
> What I would like to do is create a single continuous variable that gives the mean birth weight for each gestational week, so that I don't have to look at each week individually as above. Ideally, I would like to be able to use this variable in scatter plots.
>
> If I plot as follows:
>
> twoway scatter bw gestwk
>
> I of course don't get a single estimate for each gestational week; instead, the entire range of birth weights for a particular week is plotted.
>