Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: generating observations in data set
From: Tim Evans <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: RE: st: generating observations in data set
Date: Thu, 7 Mar 2013 13:21:30 +0000
Nick,
Thanks for that, it worked a treat! The workaround I was trying was not as good as your suggestion and would not have worked well with automation. I struggled to find the right words for my 'omitted'/'absent' data, but I'm glad you understood what I was trying to achieve!
Best wishes
Tim
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
Sent: 07 March 2013 12:16
To: [email protected]
Subject: Re: st: generating observations in data set
"Missing" is naturally a treacherous word here: although you carefully said "missing observations" that is all too likely to be read as "observations with missing values".
If something might (should) be in the dataset, but is not, I prefer to say "omitted" but my chances of convincing the world on this point are tiny.
However, terminology is not the point here.
-fillin- is your friend, e.g.
fillin yydx dis age
replace grp_count = 0 if grp_count == .
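As a minimal sketch of the same idea, the following rebuilds the small example from the question and applies -fillin- to it. The variable names (dis, yydx, age_grp, count) and storage types are taken from the example listing and are an assumption; the real dataset may use other names, as the names in the two lines above suggest. The only name supplied by Stata itself is _fillin, the indicator that -fillin- creates to mark the observations it added.

```stata
* Sketch only: rebuild the example data from the question (names assumed).
clear
input byte dis int yydx str5 age_grp int count
1 2003 "0-4"   321
1 2003 "5-9"   266
1 2003 "10-14" 201
1 2003 "15-19" 167
1 2003 "20-24" 150
2 2003 "5-9"   266
2 2003 "15-19" 167
2 2003 "20-24" 100
end

* fillin adds one observation for every combination of the listed
* variables that is not already present, and sets _fillin to 1 for them.
fillin dis yydx age_grp

* Show which combinations were omitted from the original data.
list dis yydx age_grp if _fillin == 1

* The added observations have missing count; set those to zero.
replace count = 0 if _fillin == 1

* 2 diseases x 1 year x 5 age groups = 10 observations expected.
assert _N == 10
list dis yydx age_grp count, sepby(dis)
```

Testing _fillin == 1 rather than count == . changes only the rows that -fillin- added, so any count that was already missing in the original data stays missing. Note that age_grp is a string here and therefore sorts alphabetically, not by age.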
See -help fillin- as usual and if so desired
SJ-5-1  dm0011  . . . . . . . . . . . . . .  Stata tip 17: Filling in the gaps
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q1/05   SJ 5(1):135--136                                 (no commands)
        tips for using fillin to fill in gaps in a rectangular
        data structure
which is accessible via
http://www.stata-journal.com/sjpdf.html?articlenum=dm0011
Nick
On Thu, Mar 7, 2013 at 11:23 AM, Tim Evans <[email protected]> wrote:
> I am trying to calculate age-standardised incidence rates using -distrate-, a user-written package (accessible by -ssc install distrate-), in Stata 11.2, and I need help identifying where levels are missing from my dataset.
>
> I have 5-year age groups and am looking at type 1 and type 2 disease. For type 1 disease I have observations in every age group from 0-4 to 85+, but for type 2 disease there are no observations in the 0-4 and 10-14 age groups. What I would like to do is check whether any observations are 'missing' and, where they are, insert a row for that age group with the number of observations set to 0. This may happen many times in my data, as I have multiple years of data. My data look like this:
>
> dis  yydx  age_grp  count
> 1    2003  0-4      321
> 1    2003  5-9      266
> 1    2003  10-14    201
> 1    2003  15-19    167
> 1    2003  20-24    150
> 2    2003  5-9      266
> 2    2003  15-19    167
> 2    2003  20-24    100
>
> I would like to be able to change it to this:
>
> dis  yydx  age_grp  count
> 1    2003  0-4      321
> 1    2003  5-9      266
> 1    2003  10-14    201
> 1    2003  15-19    167
> 1    2003  20-24    150
> 2    2003  0-4      0
> 2    2003  5-9      266
> 2    2003  10-14    0
> 2    2003  15-19    167
> 2    2003  20-24    100
>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/