Phil, that did the trick. Thanks a lot!
Best wishes,
Alexander
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Phil Schumm
Sent: 19 February 2008 12:53
To: [email protected]
Subject: Re: st: Dropping not valid emails
On Feb 19, 2008, at 5:10 AM, [email protected] wrote:
> I have a variable that contains e-mail addresses. I would like to find
> a way to drop observations that do not contain valid email addresses.
> The condition should be that the string contains one '@', contains a
> period followed by two or three characters, and has no empty
> spaces.
This is best handled with a regular expression (i.e., with the -regexm()-
function). For example, the following regular expression will match most valid email addresses:
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z][a-zA-Z][a-zA-Z]?[a-zA-Z]?$
Unfortunately, Stata's lack of support for curly braces (used to indicate bounds) means that you cannot use the following, shorter
expression:
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$
which is essentially the same as the one above. But the first one will work fine with -regexm()-.
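As a minimal sketch of how the first expression might be applied, assuming the addresses are stored in a string variable called -email- (a hypothetical name; the sample values below are invented purely for illustration):

```stata
* Sketch only: the variable name -email- and the data are made up.
clear
input str40 email
"jane.doe@example.com"
"no-at-sign.example.com"
"two@@example.com"
"has space@example.com"
"bob@example.org"
end

* regexm() returns 1 if the string matches the pattern and 0 otherwise.
generate byte valid = regexm(email, "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z][a-zA-Z][a-zA-Z]?[a-zA-Z]?$")

* Drop every observation whose address failed the match.
drop if !valid
drop valid
list
```

Generating the flag first, rather than dropping directly, lets you inspect the rejected addresses (e.g., with -list email if !valid-) before committing to the drop.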
-- Phil
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/