Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: How can I fill in missing values for the month or quarter in this data set?
From
Nick Cox <[email protected]>
To
[email protected]
Subject
Re: st: How can I fill in missing values for the month or quarter in this data set?
Date
Wed, 14 Nov 2012 20:04:39 +0000
-fillin- is the command of choice here.
. fillin state year quarter
That said, the larger issue is why you want to do this. -fillin-
without any extras will just extend your dataset and add missings on
-expenditure-, but necessarily no extra information.
As elsewhere I've posted on interpolation, it may seem odd that I ask
this, but what will you gain thereby?
Nick
On Wed, Nov 14, 2012 at 7:47 PM, Aaron Kirkman <[email protected]> wrote:
> I have a dataset in this format:
>
> #####
> input year str8 state quarter expenditure
> 1998 Illinois 1 1000
> 1998 Illinois 3 3000
> 1998 Illinois 4 4000
> 1998 Illinois 2 2000
>
> 2000 Illinois 1 5000
> 2000 Illinois 4 11000
> 2000 Illinois 2 6000
> 2000 Illinois 3 8000
>
> 2001 Illinois 2 1575
>
> 2002 Illinois 4 5600
>
> 1998 Missouri 3 2100
> 1998 Missouri 4 2100
> 1998 Missouri 1 1900
> 1998 Missouri 2 1900
>
> 2000 Missouri 1 8500
>
> 2001 Missouri 1 8000
> 2001 Missouri 2 9800
> 2001 Missouri 3 7500
> 2001 Missouri 4 1800
>
> 2002 Missouri 3 1400
> end
> #####
>
> I'd like to fill in the missing quarters for each state/year group, so
> that each state/year group has four observations that include quarters
> 1 - 4. As long as the --expenditure-- value is preserved for the
> quarter in which it occurs in the original dataset, it can be missing
> for the rest.
>
> I attempted a solution using --expand--, but I'm not sure how to
> proceed from here.
>
> #####
> bysort state year: egen numquarters = count(expenditure)
> gen quartermissing = numquarters < 4
> drop numquarters
> expand 4 if quartermissing, gen(new)
> sort state year
> #####
>
> This successfully fills in the missing quarters with the value of the
> only existing quarter, and it's trivial to set these values to missing
> instead. The variables --quartermissing-- and --new-- indicate if
> quarters were missing and if new observations were created, but I'm
> stuck as to how I can proceed from there.
>
> If there is a way to sort the observations by group, while ensuring
> that the observation where quarter was not missing remained in its
> correct position, setting quarter to the observation number by group
> would work. Unfortunately, I'm not sure how to go about sorting in
> this way, since logically, the sorting will be dealing with either
> missing values or identical quarter numbers, which complicates the
> situation.
>
> Is there a simple way to accomplish this that I'm missing?
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/