If "A" and "X" are not the only alphabetic characters, then

. list if missing(real(codeks))

will also list the observations that contain those other alphabetic
characters, which is not wanted.

Here is my final attempt at using my idea with regular expressions
(found1 is Nick's original solution; found2 is my solution, I hope):

. input str3 codeks

        codeks
  1. 101
  2. 102
  3. 01A
  4. 01X
  5. 0AX
  6. EFG
  7. end

. gen byte found1 = (strpos(codeks, "A") > 0) | (strpos(codeks, "X") > 0)
. gen byte found2 = regexm(codeks, "A|X")
. list
     +--------------------------+
     | codeks   found1   found2 |
     |--------------------------|
  1. |    101        0        0 |
  2. |    102        0        0 |
  3. |    01A        1        1 |
  4. |    01X        1        1 |
  5. |    0AX        1        1 |
     |--------------------------|
  6. |    EFG        0        0 |
     +--------------------------+

. list if missing(real(codeks))
     +--------------------------+
     | codeks   found1   found2 |
     |--------------------------|
  3. |    01A        1        1 |
  4. |    01X        1        1 |
  5. |    0AX        1        1 |
  6. |    EFG        0        0 |
     +--------------------------+

. list if found2
     +--------------------------+
     | codeks   found1   found2 |
     |--------------------------|
  3. |    01A        1        1 |
  4. |    01X        1        1 |
  5. |    0AX        1        1 |
     +--------------------------+

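As a further sketch only, not something tested in this thread: the same
regexm() idea should extend to a character class, so that "A or X" and
"any letter at all" can be flagged separately. The variable names found3
and anyalpha are made up for illustration, and the second pattern assumes
that only unaccented ASCII letters occur in codeks.

. * either "A" or "X" anywhere in the string (character class instead of |)
. gen byte found3 = regexm(codeks, "[AX]")
. * any ASCII letter at all, which would also flag a value such as "EFG"
. gen byte anyalpha = regexm(codeks, "[A-Za-z]")
. * observations that contain letters other than "A" and "X" only
. list if anyalpha & !found3

On the six example values, found3 should agree with found2, and the last
command should list only the observation with "EFG".
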
Anders
On Dec 14, 2007 12:05 PM, Anders Alexandersson <[email protected]> wrote:
> I prefer Gabi's and Nick's solution(s) too.
>
> Of course, regarding my own ideas, it is egen's rowtotal() function,
> not egen's total() function or generate's sum() function, that creates
> regular sums but that idea is moot by now. And to make bad things
> worse, from my perspective, I need to read up on how to use the
> logical operator in regular expressions in Stata.
>
> Thanks,
> Anders
>
>
> On Dec 14, 2007 11:40 AM, Nick Cox <[email protected]> wrote:
> > Yes, good idea. In so far as "A" and "X" are the only alphabetic
> > characters, then
> >
> > . list if missing(real(codeks))
> >
> > will identify any observations with "A", "X", "AX" within -codeks-,
> > and no others.
> > Note that no intermediate variable is needed for that purpose.
> >
> > Gabi Huiber
> >
> > This solution is not general, in that it assumes that all the codeks
> > values that do not include A, X, or AX are numeric strings. If that is
> > the case, Badi could simply do this:
> >
> > gen x=real(codeks)
> >
> > and x will show a missing value everywhere codeks shows A, X, or AX.
> >
> > Then it won't be too hard to say
> >
> > gen codeks_ax_dummy=(x==.)
> > drop x
>