Alejandro Riano <[email protected]> writes,
> Dear Statalisters, i have two databases, the first one has monthly data by
> country like this:
>
> country year month var1
> Argentina 1989 1 x
> Argentina 1989 2 x
> . . . .
> . . . .
> Argentina 1999 12 x
>
> for some set of countries (103 in total) and i have a second database which
> has all the bilateral combinations among the countries that are included in
> the first database (on a YEARLY basis), like this:
>
> country1 country2 year var2
> Argentina Brazil 1989 x
> Argentina Brazil 1990 x
> . . . .
> . . . .
> Zambia Zimbabwe 1999 x
>
> My problem is that i want to merge the var1 of the first database into the
> second one and in that way trasform my second bilateral base from a yearly
> frecuency to a monthly one. [...]
Let's call the two datasets monthly.dta and combo.dta.
Step 0: Verify what Alejandro thinks is true is true
-----------------------------------------------------
. use monthly, clear
. sort country year month
. by country year month: assert _N==1
. use combo, clear
. sort country1 country2 year
. by country1 country2 year: assert _N==1
Step 1: Convert combo.dta to monthly format
--------------------------------------------
. use combo, clear
. expand 12 /* 12 months per year */
. sort country1 country2 year
. by country1 country2 year: gen month = _n
. save combom
At this step we have not really "converted" the data to monthly format, who
knows what the value of var2 was for in January for (Argentiny, Brazil, 1989)?
Maybe is was var2 (maybe var2 is "countries in South America"), maybe as
a first approximation is was var2/12 (maybe var2 is "total exports"), maybe
something else. I will leave that for Alejandro to figure out.
Step 2: Merge on country1
--------------------------
. use monthly
. rename country country1
. sort country1 year month
. save, replace
. use combom, clear
. sort country1 year month
. merge country1 year month using monthly, nokeep
. save inprocess
. drop _merge /* but only after checking it */
(Technical note: The -nokeep- option on merge produces the same result as if
we typed -merge- without the -nokeep- followed by -keep if _merge==1 |
_merge==3-.)
At this point, we just did a spread merge. In the master data set (the data
set in memory), we had multiple observations for each (country1, year, month).
We had one (country1, year, month) observation in the using data. The way
-merge- works, Stata spread the single observation (duplicated it) across all
the relevant (country1, year, month) observations.
Step 3: Merge on country2
--------------------------
Our datasets has pairs of country, so now we must do the merge on the other
country, too. We must remember to rename the var1 variables in monthly.dta so
that they have a names in our final dataset:
. use monthly, clear
. rename country1 country2
. rename var1 var1_2
. sort country2 year month
. save monthly2, replace
. use inprocess, clear
. sort country2 year month
. merge country2 year month using monthly, nokeep
. drop _merge /* but only after checking it */
Step 4: clean up
-----------------
. erase combom.dta
. erase monthly2.dta
. erase inprocess.dta
-- Bill
[email protected]
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/