Statalist The Stata Listserver

st: regexm and boundary symbol \b

From   Jacob Wegelin <[email protected]>
To   [email protected]
Subject   st: regexm and boundary symbol \b
Date   Thu, 21 Dec 2006 18:16:07 -0800 (PST)

In many regular expression engines the symbol \b denotes a word
boundary.  For instance, in Unix, the following use of '\b' with egrep
selects only those lines of a file in which the letter 's' stands
alone, not adjacent to any other letter, digit, or underscore.

UNIX> cat z
dogs and cats
s, he said
george's crown
UNIX> egrep 's' z
dogs and cats
s, he said
george's crown
UNIX> egrep '\bs\b' z
s, he said
george's crown
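
To make explicit what '\bs\b' is doing here, the same selection can be sketched without \b, using only anchors, alternation, and bracket expressions. This is a minimal sketch, not a definitive equivalent: the function name standalone_s and the temporary file are hypothetical, introduced only for illustration, and "word character" is assumed to mean a letter, digit, or underscore.

```shell
# Recreate the three-line sample file z from the transcript in a temp file.
z=$(mktemp)
printf '%s\n' 'dogs and cats' 's, he said' "george's crown" > "$z"

# Hypothetical helper: emulate \bs\b in plain POSIX ERE.
# The 's' must be preceded by the start of the line or a non-word
# character, and followed by the end of the line or a non-word character.
standalone_s() {
    grep -E '(^|[^[:alnum:]_])s($|[^[:alnum:]_])' "$1"
}

# Lists the lines of the file in which 's' stands alone.
standalone_s "$z"
```

The pattern relies only on ^, $, |, parentheses, and bracket expressions, so the same idea might carry over to an engine that lacks \b, assuming it supports those constructs; that is untested here.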

Is there a way to do this in Stata? The following attempts did not work;
neither of the last two commands lists any observations:

. list

     | var1             var2 |
  1. |    1    dogs and cats |
  2. |    2              sss |
  3. |    3       s, he said |
  4. |    4   george's crown |

. list if regexm(var2, "s")

     | var1             var2 |
  1. |    1    dogs and cats |
  2. |    2              sss |
  3. |    3       s, he said |
  4. |    4   george's crown |

. list if regexm(var2, "\bs\b")

. list if regexm(var2, "\\bs\\b")

Thanks for any info.

Jake Wegelin