Re: st: string variable

From	"Austin Nichols" <[email protected]>
To	[email protected]
Subject	Re: st: string variable
Date	Tue, 13 Nov 2007 09:49:01 -0500

Nick--
There are several applications, e.g. -xtreg, i(id)-, where a numeric
id is required (for no apparent reason, but required nonetheless).
Why we cannot simply:
  egen g=group(id)
and keep numeric and string identifiers is not clear, perhaps, but
suppose we want:
  list g
to produce correct-looking identifiers, for whatever reason.  Then the
idea of my posted approach is correct, though the details are
not--there is a missing -if- condition and -labmask- will not work
here.  But a solution from first principles is easy, I think:

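clear
* build 500 example ids; observations 65-90 get a letter appended
loc N 500
set obs `N'
g id=string(_n)
replace id=id+char(_n) in 65/90
codebook id
*-encode- won't work if N too great
*encode id, gen(numid)
*(nor will -labmask- apparently)
* ids that are purely numeric convert directly
gen numid=real(id)
* the rest are kept as strings and numbered above the largest numeric id
gen strid=id if mi(numid)
egen g=group(strid)
su numid, meanonly
replace numid=r(max)+g if mi(numid)
* label each new number with its original string id
levelsof strid, loc(vals)
foreach v of loc vals {
 su numid if strid=="`v'", meanonly
 la def numid `r(max)' "`v'", modify
 }
la val numid numid
codebook numid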
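
A quick way to check the result, offered only as a sketch: it assumes the
code above has just been run in the same session, and the variable name
-check- is made up here for illustration.

```stata
* every observation should now have a numeric id
assert !mi(numid)
* each original id should map to exactly one number
bysort id (numid): assert numid[1]==numid[_N]
* for the labelled values, the label should reproduce the original string
* (-decode- leaves unlabelled values as empty strings)
decode numid, gen(check)
assert check==id if !mi(strid)
* show the ids that needed labels
list id numid if !mi(strid)
```

If any -assert- fails, Stata stops with an error, which points to the step
that needs a second look.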


On 11/13/07, Nick Cox <[email protected]> wrote:
> Austin is right that -egen, group()- will assign integers
> 1 up. But if -encode- won't play at assigning labels because
> there are too many distinct values, then I don't think -labmask-
> (or even -egen, group()- with the -label- option) will help
> either.
>
> I am still puzzled at the original question. On the face of
> it the variable in question is some kind of identifier. It
> is difficult to see any sense in which it is better off as
> a numeric variable. If there are thousands of distinct values
> it would be no use for any kind of modelling, so far as I can imagine.
>
> Nick
> [email protected]