Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Nick Cox" <n.j.cox@durham.ac.uk> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | st: RE: Data Extraction from Cells |
Date | Mon, 12 Jul 2010 14:56:57 +0100 |
"cell" here appears to mean value of a string variable in an observation. "column" here appears to mean variable. Stata is not a spreadsheet! -substr()- is the appropriate function. See -help functions- and then look under string functions. gen part1 = substr(whatever, 1, 2) gen part2 = substr(whatever, 3, 2) gen part3 = substr(whatever, 5, 2) Nick n.j.cox@durham.ac.uk Samuel Finkelstein I am currently using a dataset that includes multiple indicators within the same cell. For instance, if a firm is publicly traded, was incorporated in the last five years, and is in the utilities industry, a single cell may contain information such as 1A3B3N, with each number/letter combination denoting one of these characteristics (with no spaces in between the various combinations). Is there a way that I can extract each of the number/letter cominations into different (newly created) columns, such that I have one column that has 1A, one column with 3B, and one with 3N? For my purposes, the columns can contain different codes/identifiers. For example, the first column could have 1A for Company A and 2F for Company B. I am only trying to use these codes to identify firms with two or three specific characteristics. * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/