OK, joking aside:
We have two approaches to doing it. The
comparison is interesting.
(1) no change of data shape.
(a) add extra pseudo-observations
(b) -fillin-
(c) drop extra pseudo-observations
For what I recommended, you don't have to go into -edit-.
Let's look at the all-command version.
set obs `= _N + 12'
tokenize "1983 1986 1989 1990 1993 1994 1997 1998 2001 2002 2003 2004"
forval i = 1/12 {
    replace year = ``i'' in -`i'
}
fillin fip county year
drop if county == . | fip == .
(2) change of data shape.
(a) -reshape-
(b) new variables
(c) -reshape- back
egen id=group(county fip)
reshape wide vbl, i(id) j(year)
foreach y of num 1983 1986 1989 1990 1993 1994 1997 1998 2001/2004 {
    gen vbl`y' = .
}
reshape long vbl, i(id) j(year)
drop id
So it looks like 7 lines of code in each case. I eat some words....
Nick
Scott Cunningham
> Sent: 24 November 2006 16:07
> To: [email protected]
> Subject: Re: st: RE: help expanding the dataset by n observations
>
>
> On Nov 24, 2006, at 10:57 AM, Nick Cox wrote:
>
> > No, no, no. You want extra observations.
> > Try my advice. It should take about two minutes.
> > -reshape- is wonderful, but not the answer here.
>
> Really? It is making extra observations. I have to reshape wide
> first, generate the variables with the missing data, then reshape
> back to long, but it worked. Here's what I did. Here's a sample slice
> of my dataset:
>
> +-----------------------------+
> | county fip year vbl |
> |-----------------------------|
> 1. | 1 1 1980 . |
> 2. | 1 1 1981 . |
> 3. | 1 1 1982 . |
> 4. | 1 1 1984 . |
> 5. | 1 1 1985 . |
> |-----------------------------|
> 6. | 1 1 1987 . |
> 7. | 1 1 1988 . |
> 8. | 1 1 1991 . |
> 9. | 1 1 1992 . |
> 10. | 1 1 1995 . |
> |-----------------------------|
> 11. | 1 1 1996 . |
> 12. | 1 1 1999 . |
> 13. | 1 1 2000 . |
> 14. | 3 1 1980 1 |
> 15. | 3 1 1981 1 |
> |-----------------------------|
> 16. | 3 1 1982 1 |
> 17. | 3 1 1984 0 |
> 18. | 3 1 1985 0 |
> 19. | 3 1 1987 0 |
> 20. | 3 1 1988 0 |
> |-----------------------------|
> 21. | 3 1 1991 0 |
> 22. | 3 1 1992 0 |
> 23. | 3 1 1995 0 |
> 24. | 3 1 1996 0 |
> 25. | 3 1 1999 0 |
> |-----------------------------|
> 26. | 3 1 2000 0 |
> +-----------------------------+
>
>
> . egen id=group(county fip)
> . reshape wide vbl, i(id) j(year)
> . gen vbl1983=.
> . gen vbl1986=.
> ...
> . gen vbl2004=.
> . reshape long vbl, i(id) j(year)
> . list
>
> +----------------------------------+
> | id year county fip vbl |
> |----------------------------------|
> 1. | 1 1980 1 1 . |
> 2. | 1 1981 1 1 . |
> 3. | 1 1982 1 1 . |
> 4. | 1 1983 1 1 . |
> 5. | 1 1984 1 1 . |
> |----------------------------------|
> 6. | 1 1985 1 1 . |
> 7. | 1 1986 1 1 . |
> 8. | 1 1987 1 1 . |
> 9. | 1 1988 1 1 . |
> 10. | 1 1989 1 1 . |
> |----------------------------------|
> 11. | 1 1990 1 1 . |
> 12. | 1 1991 1 1 . |
> 13. | 1 1992 1 1 . |
> 14. | 1 1993 1 1 . |
> 15. | 1 1994 1 1 . |
> |----------------------------------|
> 16. | 1 1995 1 1 . |
> 17. | 1 1996 1 1 . |
> 18. | 1 1997 1 1 . |
> 19. | 1 1998 1 1 . |
> 20. | 1 1999 1 1 . |
> |----------------------------------|
> 21. | 1 2000 1 1 . |
> 22. | 1 2001 1 1 . |
> 23. | 1 2002 1 1 . |
> 24. | 1 2003 1 1 . |
> 25. | 1 2004 1 1 . |
> |----------------------------------|
> 26. | 2 1980 3 1 1 |
> 27. | 2 1981 3 1 1 |
> 28. | 2 1982 3 1 1 |
> 29. | 2 1983 3 1 . |
> 30. | 2 1984 3 1 0 |
> |----------------------------------|
> 31. | 2 1985 3 1 0 |
> 32. | 2 1986 3 1 . |
> 33. | 2 1987 3 1 0 |
> 34. | 2 1988 3 1 0 |
> 35. | 2 1989 3 1 . |
> |----------------------------------|
> 36. | 2 1990 3 1 . |
> 37. | 2 1991 3 1 0 |
> 38. | 2 1992 3 1 0 |
> 39. | 2 1993 3 1 . |
> 40. | 2 1994 3 1 . |
> |----------------------------------|
> 41. | 2 1995 3 1 0 |
> 42. | 2 1996 3 1 0 |
> 43. | 2 1997 3 1 . |
> 44. | 2 1998 3 1 . |
> 45. | 2 1999 3 1 0 |
> |----------------------------------|
> 46. | 2 2000 3 1 0 |
> 47. | 2 2001 3 1 . |
> 48. | 2 2002 3 1 . |
> 49. | 2 2003 3 1 . |
> 50. | 2 2004 3 1 . |
> +----------------------------------+
>
> Then I need to code the missing values (omitted here). This solution
> works with many different counties, and so seemed more efficient than
> going through -edit-.