-by:- is sweet [was: Re: Re: st: Creating a new variable with information from other observations]
From: n j cox <[email protected]>
To: [email protected]
Date: Mon, 19 May 2008 15:48:03 +0100

A more mundane solution uses -by:-.
gen is_capital = capitalid == cityid
bysort countryid (is_capital) : gen latitude_capital = latitude[_N]
The indicator (dummy) is 1 when a city is the capital and 0 otherwise.
If you sort each capital city to the end of the block of observations
for a country, then you can just pick up its value for the new variable.
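A more cautious approach adds an extra condition to the end of the second
statement:
if is_capital[_N]
Then a country whose capital is missing from the data gets missing values
rather than the latitude of whichever city happens to sort last.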
So, no loops necessary at all. Or, more precisely, Stata does the loop
required automatically as a consequence of -by:-.
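Put together, the two statements can be tried out on a small example. This is only a sketch: the toy data below (identifiers, years and latitudes) are invented for illustration, and the variable names follow the original question.

```stata
* Toy panel: two countries, a few cities each, two years (all values made up)
clear
input countryid cityid capitalid year latitude
1 10 11 2000 40.7
1 11 11 2000 38.9
1 12 11 2000 41.9
1 10 11 2001 40.7
1 11 11 2001 38.9
1 12 11 2001 41.9
2 135 135 2000 51.5
2 140 135 2000 53.5
2 135 135 2001 51.5
2 140 135 2001 53.5
end

* Indicator: 1 if the observation is a capital city, 0 otherwise
gen byte is_capital = capitalid == cityid

* Within each country, sort capitals to the end of the block and copy
* the last latitude, but only if the last observation really is a capital
bysort countryid (is_capital) : gen latitude_capital = latitude[_N] if is_capital[_N]

* Check that the result is constant within each country
bysort countryid (latitude_capital) : assert latitude_capital[1] == latitude_capital[_N]

list countryid cityid year latitude latitude_capital, sepby(countryid)
```

Because the data are a panel, each capital appears once per year, so several observations with is_capital equal to 1 end up at the end of each block. That is harmless here as long as a city's latitude does not change over time.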
The following poem [by one Sta Ta?] fell into my hands recently.
Something to repeat?
Seek a method neat.
Loops are lovely,
-by:- is sweet.
The style leaves much to be desired, but the content is good.
Nick
[email protected]
Teresio Poggio
from your dataset I'd build a "just capitals" dataset:
- select just the capitals (drop if cityid != capitalid)
- in the new dataset keep just capitalid and latitude
- rename latitude to latitude_capital
- drop duplicate observations, so that each capital appears only once
  (in a panel it is repeated for every year)
- sort the data by capitalid and save it
then open your original dataset and sort it by capitalid,
merge it with the new "just capitals" dataset using capitalid as the key
and the option uniqusing, since capitalid is unique in the using dataset
but not in the master
(help merge for details)
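The merge route above could look roughly like the following. It is a sketch under two assumptions: the original panel is already in memory, and the -merge m:1- syntax of Stata 11 and later is available (that syntax is newer than this thread). The temporary file name capitals is made up for the example.

```stata
* Build a lookup dataset with one row per capital
preserve
keep if cityid == capitalid
keep capitalid latitude
rename latitude latitude_capital
duplicates drop          // panel data repeat each capital once per year
isid capitalid           // stop with an error if a capital has conflicting latitudes
tempfile capitals
save `capitals'
restore

* Attach the capital's latitude to every city; m:1 asserts that
* capitalid is unique in the using (capitals) dataset
merge m:1 capitalid using `capitals', keep(master match) nogenerate
```

Cities whose capital is not in the data are kept and get a missing latitude_capital, which matches the behaviour of the more cautious -by:- statement.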
Davide Cantoni
> I am having a rather intricate problem in creating a new variable in a
> panel dataset, and I appreciate any help you could offer. I hope the
> problem can potentially be of general interest.
>
> I have a panel dataset of cities and their characteristics in
> different countries. I know the latitude of each one of these cities,
> but now I want to create an additional variable reflecting the
> latitude of the capital city of the country a given city lies in. So
> for example: for the cities of New York, Chicago, etc., I want this
> new variable to contain the latitude of Washington, DC.
>
> Here is a description of the dataset's structure: it is a panel in
> long form, with cities in different countries, observed over different
> years. Each city has a unique numeric identifier, "cityid". Then there
> is a country identifier, called "countryid". Finally, there is a
> variable that repeats the capital city's cityid for each city in a
> given country, "capitalid". For instance, if the cityid of London was
> 135, all cities in the dataset that are in the UK would get a value of
> 135 in the variable "capitalid". Finally, there is a variable called
> "latitude" that reflects the latitude of each city.
>
> How would I now proceed to create this new variable, call it
> "latitude_capital", by using the variables above?
>
> Basically, the problem I'm having is
> - tell stata to look up for each city its capitalid
> - browse the dataset until you find a city that has the cityid equal
> to this capitalid
> - find out the latitude of this capital city
> - go back to the original city and replace "latitude_capital" with the
> latitude you've just retrieved
>
> The additional problem I encounter while trying to construct something
> with "foreach..." (that, at least, is what I was trying so far) is
> that the values that the capitalid variable takes are of course not a
> clean numlist (like "1(1)100"), but rather a sequence of numbers
> without any regularity, such as 11 12 50 54 60 131... and so on.
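
No loop is needed for this problem, but the loop the question was reaching for can still be written without a clean numlist. The sketch below assumes the variable names from the question and uses -levelsof- to collect the distinct values of capitalid, however irregular they are.

```stata
* Start with an empty variable and fill it in one capital at a time
gen latitude_capital = .

* Put the distinct values of capitalid (e.g. 11 12 50 54 ...) in a local macro
levelsof capitalid, local(capitals)

foreach c of local capitals {
    * Latitude of the capital itself; r(mean) is missing if it is not in the data
    summarize latitude if cityid == `c', meanonly
    * Copy it to every city that has this capital
    replace latitude_capital = r(mean) if capitalid == `c'
}
```

This makes one pass over the data per capital, so on a large dataset the -by:- solution will usually be both shorter and faster.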