Thank you, Nick. I had considered that tweak. However,
I need to collapse by (youthid), which then gives me a
value of allmissing for each youth, but does not tag
the missing group for replacement:
youthid 11 12
grp1dur 15 0
grp2dur 41 13
grp3ddur 0 0
grp4dur 0 42
grp5dur 0 0
grp6dur 0 55
allmissing 1 0
Melonie
--- n j cox <[email protected]> wrote:
> Below my signature is the key part of the posting
> referred to.
>
> You can tweak that approach by
>
> bysort youthid group (duration) : gen allmissing =
> missing(duration[1])
>
> and then -collapse- as before.
>
> Nick
> [email protected]
>
> OK. Consider this toy dataset:
>
> . l
>
> +------------------+
> | response group |
> |------------------|
> 1. | 1 1 |
> 2. | 2 1 |
> 3. | 3 2 |
> 4. | . 2 |
> 5. | 0 3 |
> |------------------|
> 6. | . 3 |
> 7. | . 4 |
> 8. | . 4 |
> +------------------+
>
> I'd regard 0 as a sensible sum for group
> 3 but not for group 4, but as you say
>
> . collapse (sum) response, by(group)
>
> . l
>
> +------------------+
> | group response |
> |------------------|
> 1. | 1 3 |
> 2. | 2 3 |
> 3. | 3 0 |
> 4. | 4 0 |
> +------------------+
>
> One way round this is to tag groups
> for which _all_ values are missing.
> One way to do that is
>
> . bysort group (response) : gen allmissing =
> mi(response[1])
>
> That is, within groups, sort on response. If the
> first
> value within each group is missing, then all of them
> in that group must be.
>
> The principle here is close to one explained at
> http://www.stata.com/support/faqs/data/diff.html
> You might also like to look at
> http://www.stata.com/support/faqs/data/anyall.html
>
> . l
>
> +-----------------------------+
> | response group allmis~g |
> |-----------------------------|
> 1. | 1 1 0 |
> 2. | 2 1 0 |
> 3. | 3 2 0 |
> 4. | . 2 0 |
> 5. | 0 3 0 |
> |-----------------------------|
> 6. | . 3 0 |
> 7. | . 4 1 |
> 8. | . 4 1 |
> +-----------------------------+
>
> Now we make sure the -collapse- carries along
> the important information:
>
> . collapse (sum) response (min) allmissing,
> by(group)
>
> . l
>
> +-----------------------------+
> | group response allmis~g |
> |-----------------------------|
> 1. | 1 3 0 |
> 2. | 2 3 0 |
> 3. | 3 0 0 |
> 4. | 4 0 1 |
> +-----------------------------+
>
> and convert back:
>
> . replace response = . if allmissing
> (1 real change made, 1 to missing)
>
> . l
>
> +-----------------------------+
> | group response allmis~g |
> |-----------------------------|
> 1. | 1 3 0 |
> 2. | 2 3 0 |
> 3. | 3 0 0 |
> 4. | 4 . 1 |
> +-----------------------------+
>
> Melonie Sullivan
>
> I have data on history of placements into different
> -group-s by -youthid-: there are multiple placement
> records for each youth. I need to create a variable
> equal to the sum of -duration- of all placements
> into
> each -group- for each youth. -collapse (sum)- seems
> to be
> the appropriate procedure, but it treats missing
> values as zeroes. This causes a problem if a youth
> has
> only one placement in a given group with unknown
> duration. Example:
>
>
> youthid group duration
> 11 1 15
> 11 1 .
> 11 2 31
> 11 2 10
> 11 5 .
> 12 2 5
> 12 2 8
> 12 4 42
> 12 6 55
>
> I create a duration variable for each group
> (-generate
> grp1dur = duration if group==1-, etc.) and -collapse
> (sum)- by -youthid- and I want to get this:
>
> youthid 11 12
> grp1dur 15 0
> grp2dur 41 13
> grp3ddur 0 0
> grp4dur 0 42
> grp5dur . 0
> grp6dur 0 55
>
> But collapse gives me a zero on grp5dur for youth
> #11,
> though youth #11 had placement in that group, albeit
> of an unknown duration. The other zeroes are
> correct;
> the youth had zero days in that placement group.
>
> The problem has been addressed here before, best in
> the following post by Nick Cox:
>
>
http://www.stata.com/statalist/archive/2004-07/msg00783.html
>
> However, this is not solving my particular problem,
> because my data essentially looks like a big stack
> of
> Nick's "toy datasets" -- one for each of 1800 youth
> in
> my data. So collapsing by (youthid) gives the same
> value of Nick's allmissing for each youth, since the
> allmissing tags missing durations for groups within
> youths.
>
>
> *
> * For searches and help try:
> *
> http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/