Metcalfe, Paul
> Using Stata 8.1 SE, I'm trying to put together a loop for
> what I imagine should be quite a straightforward task.
>
> The relevant part of my data looks like the following:
>
> id        cons       time
> 5001574   .          32
> 5001574   .          31
> 5001574   0.278548   30
> 5001574   0.271683   29
> 5001574   0.378903   28
> 5001574   0.291933   27
> 5001574   0.319807   26
> 5001574   .          25
> 5001574   .          24
> 5001574   .          23
> 5001574   .          22
> 5001574   .          21
> 5001574   0.348804   20
> 5001574   0.247645   19
> 5001574   0.306516   18
> 5001574   0.303717   17
> 5001574   0.310532   16
>
> I have about 8000 different id values in the full dataset,
> observed for different stretches of time with different
> numbers of gaps in the cons variable in different places
> across the set of ids.
>
> What I would like to do is drop the observations at the end
> of the time series where cons=., but keep the observations
> in the middle. There are varying numbers of gaps in the
> cons time series for different ids, and I want to keep all
> of them except the observations at the end of the time
> series for each id. I've tried a number of different
> combinations of the while, if and foreach commands, but
> none of them has worked, so I hoped that someone on the
> list could help.
This seems a bit awkward, but it satisfies a Stataish
preference for -by:- over looping over observations.

There is a block of missing values to drop at
the start of each panel if and only if the _first_
value in each panel is missing. So let's get the
cumulative sum of mi(cons) in that case, and only in that case:

bysort id (time) : gen todrop = sum(mi(cons)) if mi(cons[1])

The values to drop will be those for which this cumulative
sum is exactly the same as _n. For example, if only
the first three values are missing, the cumulative
sum will be 1, 2, 3, 3, ... and it equals _n only for the
first three observations. Here we lean on the fact
that under -by id:- _n is evaluated within each panel.
(Similarly, in the code above [1] always means the
first observation within each panel.) Note that the time
variable is spelled out in full: once todrop exists, the
abbreviation t would be ambiguous.

bysort id (time) : drop if todrop == _n

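
As a quick check of that logic, here is a minimal sketch on made-up data. The single id and the cons values are invented for illustration and are not taken from the original dataset; the variable names follow the listing in the question.

```stata
* Toy panel (invented values): a leading block of two missings
* and one interior gap at time 4
clear
input id cons time
1 .   1
1 .   2
1 0.3 3
1 .   4
1 0.2 5
end

* Cumulative count of missings, computed only if the panel starts on a missing
bysort id (time) : gen todrop = sum(mi(cons)) if mi(cons[1])

* todrop is 1, 2, 2, 3, 3 while _n is 1, 2, 3, 4, 5
list id time cons todrop

* The two agree only within the leading block, so only times 1 and 2 go;
* the gap at time 4 stays
bysort id (time) : drop if todrop == _n
list id time cons
```

If a panel does not start on a missing value, todrop is missing throughout and never equals _n, so nothing is dropped from that panel.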
To get the blocks at the end of each panel,
we work with time measured backwards, so that the last
observation in each panel sorts first:

gen bt = -time
bysort id (bt) : gen todrop2 = sum(mi(cons)) if mi(cons[1])
bysort id (bt) : drop if todrop2 == _n

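after which we clean up, dropping the work variables
(including the reversed time variable bt) and restoring
the sort order with -tsset-, assuming the data were
previously -tsset-:

drop todrop todrop2 bt
tsset
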
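
For anyone who wants to try the whole sequence before running it on real data, here is a self-contained sketch on a toy dataset. The two ids and all cons values are made up for illustration. Because the toy data are not -tsset-, the sketch finishes with -sort- rather than -tsset-; that substitution is an assumption of the sketch, not part of the recipe above.

```stata
* Toy data (invented): id 1 has a leading missing, an interior gap
* and a trailing block of two; id 2 has only an interior gap
clear
input id cons time
1 .    1
1 0.31 2
1 .    3
1 0.28 4
1 .    5
1 .    6
2 0.25 1
2 .    2
2 0.30 3
end

* Leading blocks
bysort id (time) : gen todrop = sum(mi(cons)) if mi(cons[1])
bysort id (time) : drop if todrop == _n

* Trailing blocks, using time measured backwards
gen bt = -time
bysort id (bt) : gen todrop2 = sum(mi(cons)) if mi(cons[1])
bysort id (bt) : drop if todrop2 == _n

* Clean up and restore the natural order
drop todrop todrop2 bt
sort id time

* Check: every panel now starts and ends on a nonmissing value
by id : assert !mi(cons[1]) & !mi(cons[_N])

* Check: the interior gaps survive (id 1 at time 3, id 2 at time 2)
count if mi(cons)
assert r(N) == 2

list
```

One consequence worth knowing: a panel in which cons is missing at every time point has a cumulative sum equal to _n throughout, so the first -drop- removes that panel entirely.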
Nick
[email protected]