It can be done.
Here is one solution. It is easiest if you know
in advance roughly the maximum number of "others"
any person might have.
Guess this number, and then add some. Suppose
you guess 20, and then add 10. You get 30.
Warning: untested code ahead.
forval i = 1/30 {
gen Other`i' = ""
}
levelsof Person, local(Persons)
qui foreach P of local Persons {
levelsof City if Person == "`P'", local(Cities)
local which
foreach C of local Cities {
levelsof Person if Person != "`P'" & City == "`C'", ///
local(work) clean
local which : list which | work
}
noi di "`P': `which'"
local nothers : word count `which'
tokenize `which'
forval i = 1/`nothers' {
replace Other`i' = "``i''" if Person == "`P'"
}
}
Then clean up any empty variables:
forval i = 30(-1)1 {
assert Other`i' == ""
drop Other`i'
}
This loop is designed to fail, with an error from -assert-,
at the first Other? variable that is not all empty. It
drops Other30, Other29, ... in turn, each if
and only if it is all empty.
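If an error message at the end of a do-file is unwelcome, the same idea can be written so that the loop stops quietly. What follows is a minimal sketch, not part of the suggestion above. It assumes the variables are named Other1 upwards, and it uses a toy dataset with only 3 such variables (my assumption, for illustration) so that it runs on its own.

```stata
* Toy setup (an assumption for illustration): only Other1 is non-empty
clear
set obs 3
gen Other1 = "x"
gen Other2 = ""
gen Other3 = ""

* Work downwards; stop quietly at the first variable that is not all empty
forval i = 3(-1)1 {
    capture assert Other`i' == ""
    if _rc continue, break
    drop Other`i'
}
describe
```

Here -capture- suppresses the error from -assert-, and -continue, break- leaves the loop as soon as a non-empty variable is met, so Other1 should be the only one left.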
Alternatively, -dropmiss- from STB-60 can be used.
30 is just pulled out of the air. Your number will differ.
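Rather than guessing, one safe upper bound is the number of distinct persons minus one, since nobody can share a city with more people than that. A minimal sketch, assuming Person and City are string variables as in the code above. The toy data are those of the original question, and the names tagP and maxothers are mine.

```stata
* Toy data from the original question, with Person held as a string
clear
input str1 City str1 Person
A 1
A 2
B 1
B 5
C 1
C 5
C 4
D 8
D 1
end

* Count distinct persons; one fewer is the most "others" anyone can have
egen byte tagP = tag(Person)
quietly count if tagP
local maxothers = r(N) - 1
drop tagP
di "upper bound on others: `maxothers'"

* Create that many string variables instead of a guessed number
forval i = 1/`maxothers' {
    gen Other`i' = ""
}
```

With many persons this bound may be far larger than needed, and could run into Stata's limit on the number of variables, so a smaller informed guess may still be the practical choice. Any surplus variables can be removed afterwards by the clean-up step.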
Nick
[email protected]
Anna Lehman
> Going back to your suggestion,
> If the number of observations is large and the information does not
> fit into a string variable, is there any way I can still store the
> obtained information?
> For example, since for person 1, the list is: 2, 5 4 and 8,
> Others would contain a string with "2 5 4 8". That is fine.
> The problem is that if the list has many numbers they won't fit
> into the variable "Others".
> Can I store the different numbers (2,5, 4 and 8) in different
> columns/variables (instead of creating the variable Others)?
> This is the only way I can think of dealing with a large number of
> observations but I'm not sure how to operationalize it...
> Any suggestions?
> Thanks for your help,
> Anna
>
> >From: n j cox <[email protected]>
> >Reply-To: [email protected]
> >To: [email protected]
> >Subject: Re:st: reorganizing data
> >Date: Mon, 04 Sep 2006 15:25:01 +0100
> >
> >This should work with toy datasets. If your identifiers are
> >long, or your number of observations is large, the information
> >won't fit into a string variable, so the lines mentioning
> >"Others" should be deleted.
> >
> >gen Others = ""
> >levelsof Person, local(Persons)
> >qui foreach P of local Persons {
> > levelsof City if Person == "`P'", local(Cities)
> > local which
> > foreach C of local Cities {
> > levels Person if Person != "`P'" & City == "`C'", ///
> > local(work) clean
> > local which : list which | work
> > }
> > noi di "`P': `which'"
> > replace Others = "`which'" if Person == "`P'"
> >}
> >
> >Nick
> >[email protected]
> >
> >Anna Lehman
> >
> >I have a dataset with the following structure:
> >
> >City Person_id
> >A 1
> >A 2
> >B 1
> >B 5
> >C 1
> >C 5
> >C 4
> >D 8
> >D 1
> >
> >I would like to obtain the following:
> >for each and every person, a list with the people that have
> >apartments in the same city (independently of which city).
> >For example, for person 1, this list would be: 2, 5 4 and 8.
> >And for person 5 the list would be: 1 .
> >
> >Can you think of a relatively easy way of accomplishing this?