Correct. Consult the help:
-collapse- converts the dataset in memory into a dataset of means, sums, medians,
etc. clist must refer to numeric variables exclusively.
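To see which variables trigger the error, a sketch along these lines may help. It assumes the variable names var1-var3 from the question and a made-up toy dataset in which var3 is a string; -ds- with has(type ...) leaves the matching names in r(varlist).

```stata
* Toy data (hypothetical): var3 is a string variable.
clear
input id var1 var2 str5 var3
16 1 2 ""
16 . . "abc"
end

* List the string variables among var1-var3; names are left in r(varlist).
ds var1-var3, has(type string)
local strvars `r(varlist)'

* -collapse- on the numeric variables alone runs without a type mismatch,
* but note that it drops var3 from the resulting dataset.
ds var1-var3, has(type numeric)
collapse (min) `r(varlist)', by(id)
```

This only confirms the diagnosis; it does not preserve the string values.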
What you can do is -- if your description is correct --
egen nmiss = rowmiss(<insert variable names>)
bysort id (nmiss) : keep if _n == 1
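as the sort puts the observation with more missing values
second within each id. Note that this keeps only the more complete
observation; any values stored only in the dropped observation are lost.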
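For the case described in the question, where each observation of an id holds values the other lacks, one possible alternative is to copy values across observations of the same id before dropping any. What follows is a minimal sketch, not a tested recipe for the real data: the toy dataset is made up, the names var1-var3 are taken from the question, and it assumes that at most one observation per id has a nonmissing value for any given variable.

```stata
* Toy data (hypothetical): id 16 has var1 and var2 on one row, var3 on the other.
clear
input id var1 var2 str5 var3
16 1 2 ""
16 . . "abc"
17 5 6 "xyz"
end

* Copy each variable's nonmissing value to every observation of the same id.
* Numeric missings sort last and empty strings sort first, so take the
* end of the sorted group where a nonmissing value, if any, must be.
foreach v of varlist var1-var3 {
    capture confirm string variable `v'
    if _rc == 0 {
        bysort id (`v') : replace `v' = `v'[_N] if missing(`v')
    }
    else {
        bysort id (`v') : replace `v' = `v'[1] if missing(`v')
    }
}

* Observations within each id are now identical on var1-var3, so keep one.
bysort id : keep if _n == 1
list
```

If two observations of an id could hold different nonmissing values for the same variable, this sketch would silently keep one set of them, so that assumption is worth checking first.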
Nick
Daphna Bassok wrote:
> I have several duplicate observations in my data set. However, they are
> not perfect duplicates. Only the id # is the same. So there might be
> two observations with id#16 for instance, the first will have values for
> some variables, and missing values for others. The second will also have
> some values filled and some missing. There are no cases in which both have
> values- that is... either the first in the pair has the value OR the
> second has a value (or neither).
>
> For example: suppose I have two observations with id# 16... The first
> has values for var1 and 2 and not 3. The second ONLY has values for
> var 3. What I would like to do is simply collapse these into a single
> observation with all the relevant info. Meaning, 1 observation with
> id#16 that has values for all three variables.
>
> I am trying to do this with the collapse command with no success.
>
> My code is:
>
> collapse (min) var1-var3, by(id)
>
> I thought this would create a new observation that has all the data in it.
>
> I am getting a "type mismatch" error.
>
> Is this because some of my variables are string variables?
>
> How can I get around this?