Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Extracting Data
From: Becker Stein <[email protected]>
To: [email protected]
Subject: st: Extracting Data
Date: Wed, 20 Nov 2013 16:38:46 -0500 (EST)
-----Original Message-----
From: Becker Stein <[email protected]>
To: statalist <[email protected].>
Sent: Wed, Nov 20, 2013 9:23 pm
Subject: Help Extracting Data
Hi,
I'm trying to extract data from a single string variable, and I was
wondering how to create a regular expression that I can use to do so.
I've tried to create one just to extract the school name, but to no
avail. My data are set up as: [school district] name of school (name of
principal, name of assistant principal (*if any)) school type. Below
are some examples.
[Meadowfield] Park Square (Susan Sims, John Riley) Middle School
[Somerset] Upton & Pride Day School (Judith Taper) Elementary School
[Temperly] Lakewood School (Jason Stevenson, Jill Harris ) K-12
[Packard] W.E.B. Du Bois ( Robert Williams, Jr.) Middle School
I would like to extract the school name, principal name and asst.
principal name as separate variables. Sometimes the names have special
characters such as an "&" (as in the case of Upton & Pride) or a ".",
and the administrators section may have only 1 name or 2 names
(separated by a comma). Also, some of the data in the brackets and
parentheses have extra spaces. I initially used the itrim function on
the variable, and it removed the extra spaces for the content outside
of the brackets and parentheses (i.e., school name and school type),
but it didn't work for content inside of them (school district and
principal names).
Thanks in advance for any/all help.
Best,
Becker
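
For illustration only, here is a minimal sketch of one regular expression that fits the layout described above. It is written in Python, not Stata, so the pattern would need adapting to whichever regular-expression functions your Stata version provides. The names PATTERN, SUFFIXES, collapse, split_admins and parse are hypothetical, and the rule that a trailing "Jr." or similar belongs to the preceding name is an assumption based on the fourth example, not something stated in the post.

```python
import re

# Assumed layout: [district] school name (principal[, assistant]) school type
PATTERN = re.compile(
    r"""
    ^\s*\[\s*(?P<district>[^\]]*?)\s*\]   # text inside the square brackets
    \s*(?P<school>[^(]*?)\s*              # school name: everything before the parenthesis
    \(\s*(?P<admins>[^)]*?)\s*\)          # text inside the parentheses
    \s*(?P<school_type>.*?)\s*$           # whatever remains is the school type
    """,
    re.VERBOSE,
)

# Assumption: these tokens are name suffixes, not separate people.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def collapse(text):
    """Trim the ends and squeeze internal runs of whitespace to one space."""
    return " ".join(text.split())


def split_admins(admins):
    """Split the administrators on commas, re-attaching suffixes such as 'Jr.'."""
    names = []
    for part in (collapse(p) for p in admins.split(",")):
        if not part:
            continue
        if names and part.rstrip(".").lower() in SUFFIXES:
            names[-1] = f"{names[-1]}, {part}"
        else:
            names.append(part)
    return names


def parse(line):
    """Return the fields as a dict, or None if the line does not fit the layout."""
    m = PATTERN.match(line)
    if m is None:
        return None
    names = split_admins(m.group("admins"))
    return {
        "district": collapse(m.group("district")),
        "school": collapse(m.group("school")),
        "principal": names[0] if names else "",
        "assistant": names[1] if len(names) > 1 else "",
        "school_type": collapse(m.group("school_type")),
    }


if __name__ == "__main__":
    examples = [
        "[Meadowfield] Park Square (Susan Sims, John Riley) Middle School",
        "[Somerset] Upton & Pride Day School (Judith Taper) Elementary School",
        "[Temperly] Lakewood School (Jason Stevenson, Jill Harris ) K-12",
        "[Packard] W.E.B. Du Bois ( Robert Williams, Jr.) Middle School",
    ]
    for example in examples:
        print(parse(example))
```

The pattern relies on the delimiters rather than on the characters allowed in a name, so "&" and "." need no special handling. The extra spaces inside the brackets and parentheses are absorbed by the pattern and by collapse. A school name that itself contains a "(" would not fit this sketch.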