I think this needs a tweak:
bysort date (sum): keep if _n == 1
Since numeric missing values sort last, this
ensures that the first value of -sum- in each
group after sorting is missing if and only if
-sum- is missing on all values in that group.
code as is stands you could lose sums
you want to keep.
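
To make the tweak concrete, here is a minimal sketch of the whole sequence with the changed line in place. It assumes the variable names from the quoted example (date, amount) and, purely to keep the example short, holds the date as a string; the real data may well use a numeric date.

```stata
* Sketch only: rebuilds the six observations from the quoted listing.
* Assumption: date is stored as a string here for brevity.
clear
input str11 date amount
"10-Oct-1990" 200
"10-Oct-1990" -75
"10-Oct-1990" 64
"11-Oct-1990" .
"12-Oct-1990" 107
"12-Oct-1990" .
end

* Group total, as in the quoted code (recent Stata documents this
* egen function as total(); sum() is the older name)
bysort date: egen sum = sum(amount)

* Blank out the total on observations whose amount is missing
replace sum = . if amount == .

* Sort on sum within date so that any nonmissing total comes first,
* then keep one observation per date
bysort date (sum): keep if _n == 1

drop amount
rename sum amount
list, noobs
```

With the sort on -sum- inside each date, the 12-Oct-1990 group keeps its total of 107 regardless of the original order of its two observations, and 11-Oct-1990 stays missing because its only value is missing.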
Nick
[email protected]
Friedrich Huebler wrote:
> Eric,
>
> Here is one way to preserve the missing value.
>
> . bysort date: egen sum = sum(amount)
> . replace sum = . if amount==.
> (2 real changes made, 2 to missing)
> . bysort date: keep if _n==1
> (3 observations deleted)
> . drop amount
> . rename sum amount
> . clist, noobs
>
> date amount
> 10-Oct-1990 189
> 11-Oct-1990 .
> 12-Oct-1990 107
>
> Friedrich Huebler
>
> --- "Eric G. Wruck" <[email protected]> wrote:
> > I just learned, rather inconveniently, that collapse doesn't work the
> > way I'd like when encountering missing values. Here's an example:
> > . l
> >
> > +----------------------+
> > | date amount |
> > |----------------------|
> > 1. | 10-Oct-1990 200 |
> > 2. | 10-Oct-1990 -75 |
> > 3. | 10-Oct-1990 64 |
> > 4. | 11-Oct-1990 . |
> > 5. | 12-Oct-1990 107 |
> > |----------------------|
> > 6. | 12-Oct-1990 . |
> > +----------------------+
> >
> > . collapse (sum) net_amt=amount, by(date)
> >
> > . l
> >
> > +-----------------------+
> > | date net_amt |
> > |-----------------------|
> > 1. | 10-Oct-1990 189 |
> > 2. | 11-Oct-1990 0 |
> > 3. | 12-Oct-1990 107 |
> > +-----------------------+
> >
> > .
> > The problem is for the single 11-Oct-1990 observation. After
> > collapsing, the missing value becomes a zero; in this instance I
> > would have preferred it remain missing. The 12-Oct-1990 treatment is
> > fine & what I expected. I suppose I could delete observations before
> > performing the collapse but it would be better if there was some
> > other option. Is there?