Re: st: generating observations in data set
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: generating observations in data set
Date: Thu, 7 Mar 2013 12:15:39 +0000
"Missing" is naturally a treacherous word here: although you carefully
said "missing observations" that is all too likely to be read as
"observations with missing values".
If something might (should) be in the dataset, but is not, I prefer to
say "omitted" but my chances of convincing the world on this point are
tiny.
However, terminology is not the point here.
-fillin- is your friend, e.g.
fillin yydx dis age_grp
replace count = 0 if count == .
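As a minimal sketch of how this plays out on the example data quoted below (typed in by hand here, with the variable names dis, yydx, age_grp and count taken from the posted listing, and age_grp assumed to be a string variable): -fillin- also creates an indicator variable _fillin, equal to 1 for the observations it adds, which can be used in place of a test for missing values.

```stata
clear
* example data as posted in the question; age_grp is assumed to be a string
input byte dis int yydx str5 age_grp int count
1 2003 "0-4"   321
1 2003 "5-9"   266
1 2003 "10-14" 201
1 2003 "15-19" 167
1 2003 "20-24" 150
2 2003 "5-9"   266
2 2003 "15-19" 167
2 2003 "20-24" 100
end

* add an observation for every combination of dis, yydx and age_grp
* that does not yet occur; new observations have count missing and _fillin == 1
fillin dis yydx age_grp

* show which combinations were absent from the original data
list dis yydx age_grp if _fillin == 1

* set the count to zero for the added observations only
replace count = 0 if _fillin == 1
drop _fillin

* note: sorting on a string age_grp is alphabetical, not by age
sort dis yydx age_grp
list dis yydx age_grp count, sepby(dis) noobs
```

Keying the -replace- on _fillin rather than on missing values leaves any count that was already missing in the original data untouched, which may or may not be what is wanted.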
See -help fillin- as usual and, if so desired,

Cox, N. J. 2005. Stata tip 17: Filling in the gaps. Stata Journal 5(1): 135-136 (dm0011).

which gives tips for using -fillin- to fill in gaps in a rectangular data structure and is accessible via
http://www.stata-journal.com/sjpdf.html?articlenum=dm0011
Nick
On Thu, Mar 7, 2013 at 11:23 AM, Tim Evans <[email protected]> wrote:
> I am trying to calculate age-standardised incidence rates using -distrate-, a user-written package (installable with -ssc install distrate-), in Stata 11.2, but I need help identifying where levels are missing from my dataset.
>
> I have 5-year age groups and am looking at type 1 and type 2 disease. For type 1 disease I have observations in every age group from 0-4 to 85+, but for type 2 disease there are no observations in the 0-4 and 10-14 age groups. I would like to identify any such 'missing' observations, insert a row for each absent age group, and set its count to 0. This may happen many times in my data, as I have multiple years of data. My data look like this:
>
> dis  yydx  age_grp  count
> 1    2003  0-4      321
> 1    2003  5-9      266
> 1    2003  10-14    201
> 1    2003  15-19    167
> 1    2003  20-24    150
> 2    2003  5-9      266
> 2    2003  15-19    167
> 2    2003  20-24    100
>
> I would like to be able to change it to this:
>
> dis  yydx  age_grp  count
> 1    2003  0-4      321
> 1    2003  5-9      266
> 1    2003  10-14    201
> 1    2003  15-19    167
> 1    2003  20-24    150
> 2    2003  0-4      0
> 2    2003  5-9      266
> 2    2003  10-14    0
> 2    2003  15-19    167
> 2    2003  20-24    100
>