Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
st: How can I fill in missing values for the month or quarter in this data set?
From
Aaron Kirkman <[email protected]>
To
[email protected]
Subject
st: How can I fill in missing values for the month or quarter in this data set?
Date
Wed, 14 Nov 2012 13:47:46 -0600
Dear Statalist,
I have a dataset in this format:
#####
input year str8 state quarter expenditure
1998 Illinois 1 1000
1998 Illinois 3 3000
1998 Illinois 4 4000
1998 Illinois 2 2000
2000 Illinois 1 5000
2000 Illinois 4 11000
2000 Illinois 2 6000
2000 Illinois 3 8000
2001 Illinois 2 1575
2002 Illinois 4 5600
1998 Missouri 3 2100
1998 Missouri 4 2100
1998 Missouri 1 1900
1998 Missouri 2 1900
2000 Missouri 1 8500
2001 Missouri 1 8000
2001 Missouri 2 9800
2001 Missouri 3 7500
2001 Missouri 4 1800
2002 Missouri 3 1400
end
#####
I'd like to fill in the missing quarters for each state/year group, so
that each state/year group has four observations that include quarters
1 - 4. As long as the --expenditure-- value is preserved for the
quarter in which it occurs in the original dataset, it can be missing
for the rest.
I attempted a solution using --expand--, but I'm not sure how to
proceed from here.
#####
bysort state year: egen numquarters = count(expenditure)
gen quartermissing = numquarters < 4
drop numquarters
expand 4 if quartermissing, gen(new)
sort state year
#####
This successfully fills in the missing quarters with the value of the
only existing quarter, and it's trivial to set these values to missing
instead. The variables --quartermissing-- and --new-- indicate if
quarters were missing and if new observations were created, but I'm
stuck as to how I can proceed from there.
If there is a way to sort the observations by group, while ensuring
that the observation where quarter was not missing remained in its
correct position, setting quarter to the observation number by group
would work. Unfortunately, I'm not sure how to go about sorting in
this way, since logically, the sorting will be dealing with either
missing values or identical quarter numbers, which complicates the
situation.
Is there a simple way to accomplish this that I'm missing?
Thank you,
Aaron Kirkman
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/