
Re: st: Encode/destring

From	Joseph Coveney <[email protected]>
To	Statalist <[email protected]>
Subject	Re: st: Encode/destring
Date	Sat, 25 Feb 2006 20:58:56 +0900

Giorgia Maffini wrote:

Is anybody aware of how I could solve the following problem?
I am working with a panel of more than 70,000 firms.
When running FE and RE I need to specify the panel unit (firms in my
dataset). The panel unit has to be recorded as a numeric variable, as I
understand.

In my data the firm identifier is a STRING variable with both numbers and
letters. Example: the firm with identifier FR12345 is different from the firm
with identifier GB12345.

I used DESTRING with the IGNORE option, but
1) it is difficult to track down all the characters present in the firm
identifier variable
2) Different firms will get the same id number. Example: FR12345 and
GB12345.
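
The collision described in point 2 can be reproduced with a two-firm sketch.
The variable names and the two identifiers below are illustrative only; they
are taken from the example in the question, not from the real dataset.

```stata
* Two distinct firms whose identifiers differ only in the letter prefix
clear
input str7 firm_id
"FR12345"
"GB12345"
end

* Stripping the letters leaves the same digits for both firms
destring firm_id, generate(firm_num) ignore("FRGB")

* Both observations now carry firm_num == 12345
list firm_id firm_num
```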

I used ENCODE but I got the following error message (134): You attempted to
encode a string variable that takes on more than 65,536 unique values.

--------------------------------------------------------------------------------

First, generate a numeric variable that takes the value one at the first
observation of each (sorted) panel unit, and zero at all succeeding
observations of that panel unit.  Then replace the variable with its running
sum, -sum()-, across the dataset, so that every observation of a panel unit
carries the same integer and each panel unit gets its own consecutive number.
The technique is illustrated below with dummy data of about 150 000 panel
units.

Joseph Coveney

clear
set more off
set seed `=date("2006-02-25", "ymd")'
set obs 150000
generate str panel_unit = string(uniform(), "%19.18g")
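*
* Begin here
*
* Flag the first observation of each panel unit (1 = first, 0 = otherwise)
bysort panel_unit: generate byte panel_number = _n == 1
* The running sum of the flags numbers the panel units 1, 2, 3, ...
* (-replace- promotes the byte variable to a larger type as needed)
replace panel_number = sum(panel_number)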
exit
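
As a small worked check, the same two steps can be applied to a handful of
made-up firm identifiers in the style of the question (FR12345, GB12345). The
data and the variable names firm_id, firm_num and firm_chk are assumptions for
illustration only. The sketch also compares the result with -egen-'s group()
function, a different built-in route to the same numbering, which is not the
method described above.

```stata
* Toy panel: three firms, identifiers mixing letters and digits
clear
input str7 firm_id int year
"GB12345" 2001
"FR12345" 2001
"FR12345" 2002
"GB12345" 2002
"DE00017" 2001
end

* Step 1: flag the first observation of each (sorted) firm
bysort firm_id: generate long firm_num = _n == 1

* Step 2: running sum of the flags gives consecutive firm numbers
replace firm_num = sum(firm_num)

* Cross-check against egen's group() function (alternative technique)
egen long firm_chk = group(firm_id)
assert firm_num == firm_chk

* FR12345 and GB12345 keep distinct numbers
list firm_id year firm_num, sepby(firm_id)
```

Declaring the variable as long from the start avoids relying on type
promotion when there are many panel units.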

