Re: st: filling in existing ids and generating new ids for unique actors
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: filling in existing ids and generating new ids for unique actors
Date: Fri, 7 Sep 2012 14:12:01 +0100
Comments embedded below.
Nick
On Fri, Sep 7, 2012 at 1:03 PM, Erik Aadland <[email protected]> wrote:
> Dear Statalist.
> I have an unbalanced panel dataset.
> The structure is as follows:
> year   actor_id   actor
> 2000   .          Paul
> 2001   .          Paul
> 2002   .          Paul
> 2000   .          Sarah
> 2001   1          Sarah
> 2002   1          Sarah
> 2000   .          Simon
> 2001   2          Simon
> 2002   2          Simon
> I have 2 problems:
> 1. I want to fill in the missing existing actor_id for those actors that already have an actor_id in some years but not others.
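That's
bysort actor (actor_id) : replace actor_id = actor_id[_n-1] if missing(actor_id)
Sorting on actor_id within actor puts nonmissing values first (missing sorts highest), so each missing value picks up the id from the observation above it.
But follow with a check, which fails if any actor has more than one distinct id:
by actor : assert actor_id[1] == actor_id[_N]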
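To see the fill and the check at work, here is a minimal sketch on the example data from the question. The -input- block and the storage types (int, byte, str5) are my assumptions for illustration; the two middle commands are the ones given above.

```stata
clear
* Example data as posted; storage types are assumed for illustration
input int year byte actor_id str5 actor
2000 . "Paul"
2001 . "Paul"
2002 . "Paul"
2000 . "Sarah"
2001 1 "Sarah"
2002 1 "Sarah"
2000 . "Simon"
2001 2 "Simon"
2002 2 "Simon"
end

* Within each actor, nonmissing ids sort first; -replace- works down the
* observations in order, so a copied value cascades to later missings
bysort actor (actor_id) : replace actor_id = actor_id[_n-1] if missing(actor_id)

* Each actor should now have one id throughout (or be missing throughout)
by actor : assert actor_id[1] == actor_id[_N]

sort actor year
list, sepby(actor) noobs
```

Sarah and Simon should end up with 1 and 2 in every year. Paul stays missing in every year, which is what the second problem deals with.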
For the principles, see
SJ-2-1 pr0004 . . . . . . . . . . Speaking Stata: How to move step by: step
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
Q1/02 SJ 2(1):86--102 (no commands)
explains the use of the by varlist : construct to tackle
a variety of problems with group structure, ranging from
simple calculations for each of several groups to more
advanced manipulations that use the built-in _n and _N
> 2. I want to generate a new unique actor_id for those actors that have no actor_id in the dataset. This actor_id needs to be different from those already existing for other actors in the dataset.
> The variable -actor- lists the unique name for each actor and this unique name could be used as a basis for assigning the actor_id.
su actor_id, meanonly
local max = r(max)
egen new_actor_id = group(actor) if missing(actor_id)
replace actor_id = new_actor_id + `max' if missing(actor_id)
What this does:
1. Find the largest actor_id in use, so that any higher number is safe to use.
2. Use -egen-'s -group()- to generate new ids for those actors without one. These will run 1, 2, 3, ...
3. For those actors, new actor_id = new id + maximum.
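The three steps can be tried on a small self-contained sketch. I assume here that the existing ids have already been filled in, and I add a hypothetical fourth actor, Anna, who is not in the original data, so that more than one new id is generated. The checks at the end are my addition.

```stata
clear
* Data after existing ids are filled in; "Anna" is a hypothetical extra actor
input int year byte actor_id str5 actor
2000 . "Paul"
2001 . "Paul"
2002 . "Paul"
2000 1 "Sarah"
2001 1 "Sarah"
2002 1 "Sarah"
2000 2 "Simon"
2001 2 "Simon"
2002 2 "Simon"
2001 . "Anna"
2002 . "Anna"
end

* Step 1: largest id in use (2 in this example)
su actor_id, meanonly
local max = r(max)

* Step 2: ids 1, 2, ... for actors without one, in sort order of actor
egen new_actor_id = group(actor) if missing(actor_id)

* Step 3: shift the new ids above the existing maximum
replace actor_id = new_actor_id + `max' if missing(actor_id)

* Checks: nobody is left without an id, and no id is shared by two actors
assert !missing(actor_id)
bysort actor_id (actor) : assert actor[1] == actor[_N]

drop new_actor_id
```

Here Anna gets 3 and Paul gets 4, because -group()- numbers the groups in sort order of actor. Note that if no actor had an id to begin with, r(max) would be missing and so would the sum; in that case -group(actor)- on its own would be enough.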