gen ngrp1 = wordcount(subinstr(grp1, ";", " ", .))
etc.
Nick
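The "etc." above presumably means repeating the same line for grp2 through grp4. A minimal sketch of that as a loop, run on the two example rows from Scott Merryman's listing further down the thread. The names ngrp2 to ngrp4 simply extend Nick's ngrp1, and the str12 widths are an assumption:

```stata
* Example data copied from Scott Merryman's listing in this thread
clear
input id str12 grp1 str12 grp2 str12 grp3 str12 grp4
1 "2;3;4" "10;99;2"    "01" "11;2;25;2;3"
2 "2;3"   "10;99;2;44" "01" "11;2;25;2"
end

* Turn each ";" into a space, then count the space-delimited words
forvalues i = 1/4 {
    gen ngrp`i' = wordcount(subinstr(grp`i', ";", " ", .))
}
list id ngrp*, noobs
```

Because wordcount() ignores repeated blanks, doubled or trailing semicolons should not inflate the count, and an empty string counts as 0. Elements that themselves contain spaces would be overcounted.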
Philip Ryan
> Another possible solution makes use of the -noccur()- function written by
> Nick Winter for the -egenmore- set of user-written add-ons to -egen- (see
> findit egenmore).
>
> Here I assume you want the number of elements in each variable, not the
> number of separators (semi-colons). In general the number of elements will
> be one more than the number of separators:
>
> forvalues i = 1/4 {
> egen k`i' = noccur(grp`i'), string(";")
> replace k`i' = k`i' + 1
> }
>
> You would need to be careful that the separators are indeed all semi-colons
> (and not a mixture of semi-colons and commas such as you show in your
> message) and that there were no additional semi-colons - doubling up or
> trailing ones, for example.
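One way to guard against the pitfalls Philip lists is to normalize the separators before counting. The following is only a sketch under stated assumptions: the three messy values are made up for illustration, it handles a single variable (wrap it in -foreach- for grp1 to grp4), and it strips at most one leading and one trailing semicolon:

```stata
* Made-up messy values, for illustration only
clear
input str20 grp1
"11,2,25,2,3"
"2;;3;4;"
";10;99"
end

* Commas become semicolons
replace grp1 = subinstr(grp1, ",", ";", .)

* Collapse doubled semicolons until none remain
count if strpos(grp1, ";;") > 0
while r(N) > 0 {
    replace grp1 = subinstr(grp1, ";;", ";", .)
    count if strpos(grp1, ";;") > 0
}

* Drop a single leading or trailing semicolon, if present
replace grp1 = substr(grp1, 2, .) if substr(grp1, 1, 1) == ";"
replace grp1 = substr(grp1, 1, length(grp1) - 1) if substr(grp1, -1, 1) == ";"
```

After this pass the "separators plus one" rule applies to every nonempty value.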
Scott Merryman
> >In your example, isn't the number of semicolons: 2, 2, 0, 0 ?
> >
> >Or, do you mean something like this?
> >
> >forv i = 1/4 {
> >    qui gen gr`i' = .
> >}
> >levelsof id, local(levels)
> >foreach l of loca levels {
> >    local i = 1
> >    foreach v of varlist grp* {
> >        qui split `v' if id == `l', p(;) gen(_split)
> >        qui replace gr`i' = `=r(nvars)' if id == `l'
> >        drop _split*
> >        local ++i
> >    }
> >}
> >
> >
> >For example:
> >
> >
> >. l, noobs
> >
> > +------------------------------------------+
> > | id   grp1        grp2  grp3         grp4 |
> > |------------------------------------------|
> > |  1  2;3;4     10;99;2    01  11;2;25;2;3 |
> > |  2    2;3  10;99;2;44    01    11;2;25;2 |
> > +------------------------------------------+
> >
> >. forv i = 1/4 {
> > 2. qui gen gr`i' = .
> > 3. }
> >
> >. levelsof id, local(levels)
> >1 2
> >
> >. foreach l of loca levels {
> > 2. local i = 1
> > 3. foreach v of varlist grp* {
> > 4. qui split `v' if id == `l', p(;) gen(_split)
> > 5. qui replace gr`i' = `=r(nvars)' if id == `l'
> > 6. drop _split*
> > 7. local ++i
> > 8. }
> > 9. }
> >
> >. l,noobs
> >
> > +--------------------------------------------------------------+
> > | id   grp1        grp2  grp3         grp4  gr1  gr2  gr3  gr4 |
> > |--------------------------------------------------------------|
> > |  1  2;3;4     10;99;2    01  11;2;25;2;3    3    3    1    5 |
> > |  2    2;3  10;99;2;44    01    11;2;25;2    2    4    1    4 |
> > +--------------------------------------------------------------+
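On the semicolons-versus-elements question, a common idiom counts the separators directly as the drop in string length when they are deleted, with no loop over id. A sketch only: nsep1 and nelem1 are hypothetical names, and the strings are assumed to be nonempty and cleanly semicolon-separated:

```stata
* grp1 values taken from the listing above
clear
input id str12 grp1
1 "2;3;4"
2 "2;3"
end

* Separators: how much shorter the string gets when every ";" is deleted
gen nsep1 = length(grp1) - length(subinstr(grp1, ";", "", .))

* Elements: one more than the number of separators
gen nelem1 = nsep1 + 1
list, noobs
```

For "2;3;4" this gives 2 separators and 3 elements. An empty string would wrongly get nelem1 = 1, so such cases need separate handling.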
Alexander Nervedi
> > > I have data which has been entered awkwardly.
> > >
> > > Instead of a separate variable for each item, all items of a
> > > category are entered together in one variable.
> > >
> > > ID   Grp1   Grp2     Grp3  Grp4
> > > 001  2;3;4  10;99;2  01    11,2,25,2,3
> > >
> > >
> > > I'd like to convert this to a dataset that looks like
> > >
> > > ID   Grp1   Grp2     Grp3  Grp4
> > > 001  3      3        1     5
> > >
> > > i.e. the count of the number of semi-colons within each variable. I am
> > > sure there is a neat way of doing this but I am missing it. So I thought
> > > I'd write in and ask for your help.