Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Routine for merging many text-files
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Routine for merging many text-files
Date: Fri, 15 Mar 2013 11:43:41 +0000
I don't know why you are trying to -merge- here. It sounds more like a
job for -append-. Just as you need a "long" structure for panel data,
so also new individuals and/or years need to be added in new
_observations_.
The help for -append- explains how to -append- several files at once.
So, it sounds to me like a loop over textfiles applying -reshape- and
-save- followed by an -append-.
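A minimal sketch of such a loop, untested against the poster's files. It assumes the text files are tab-delimited, sit in the current working directory, have no spaces in their names, and match the hypothetical pattern dataset*.txt. The macro names files, stub and longlist are made up for illustration. It also assumes that every file holds the same variable y for different ids and/or years, which is the case where -append- applies:

```stata
* Sketch only. Assumptions: tab-delimited files named dataset*.txt
* (hypothetical pattern) in the current directory, each with id y1 y2 ...
local files : dir "." files "dataset*.txt"
local longlist

foreach f of local files {
    * read one wide text file, replacing whatever is in memory
    insheet using "`f'", tab clear
    * wide (y1 y2 ...) to long (one observation per id-year)
    reshape long y, i(id) j(year)
    * save under the same name with a .dta extension
    local stub : subinstr local f ".txt" ""
    save "`stub'.dta", replace
    * remember the saved file for the -append- step
    local longlist `longlist' `stub'.dta
}

* stack all the long files as new observations
clear
append using `longlist'
sort id year
```

If each file instead held a different variable for the same id-year pairs, the files would need to be combined side by side with -merge- rather than stacked with -append-.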
-fs- (SSC) is a convenience program for putting a set of filenames in
a list, usually as a preliminary to a -foreach- loop.
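As a rough illustration of -fs- in that role, assuming the same hypothetical pattern dataset*.txt and that the files are in the current working directory:

```stata
* Sketch only: -fs- is user-written, so install it once from SSC
ssc install fs

* collect the matching filenames; the pattern is hypothetical
fs dataset*.txt

* -fs- leaves the list in r(files); loop over it
foreach f in `r(files)' {
    display "`f'"
}
```

The -display- line is only a placeholder for whatever is to be done with each file.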
Nick
On Fri, Mar 15, 2013 at 11:30 AM, Simon Falck <[email protected]> wrote:
> Nick,
>
> Something in the formatting of my last email mangled the lines of code, which were supposed to look like this:
>
> *Insert dataset 1 from textfile and save as .dta
> insheet using "C:\User\dataset1", tab
> reshape long y, i(id) j(year)
> rename y var1
> save "C:\User\dataset1.dta"
>
> *Insert dataset 2 from textfile and save as .dta
> insheet using "C:\User\dataset2", tab clear
> reshape long y, i(id) j(year)
> rename y var2
> save "C:\User\dataset2.dta"
>
> *Merge dataset 1 and 2 to key-file containing joint id's.
> use "C:\User\id.dta"
> merge 1:m id year using "C:\User\dataset1.dta"
> drop _merge
> merge 1:m id year using "C:\User\dataset2.dta"
> drop _merge
>
> The -year- variable is thus created through the reshape procedure.
>
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: 15 March 2013 12:21
> To: [email protected]
> Subject: Re: st: Routine for merging many text-files
>
> It is not clear how you get a -year- variable out of your datasets.
>
> Nick
>
> On Fri, Mar 15, 2013 at 11:02 AM, Simon Falck <[email protected]> wrote:
>
>> I have about 100 text files that I want to merge into a longitudinal database and wonder if there is a routine that I can apply to ease this work.
>>
>> The text files cover different periods but have the same data structure, which is a wide format that looks something like this,
>>
>> Id y1 y2 y3
>> 1 10 30 40
>> 2 11 31 41
>> ...and so on..
>>
>> If I applied the standard procedure to create the database, the script would look something like this for two text files (as an example; it would look the same for all 100 text files):
>>
>> *Insert dataset 1 from textfile and save as .dta
>> insheet using "C:\User\dataset1", tab
>> reshape long y, i(id) j(year)
>> rename y var1
>> save "C:\User\dataset1.dta"
>>
>> *Insert dataset 2 from textfile and save as .dta
>> insheet using "C:\User\dataset2", tab clear
>> reshape long y, i(id) j(year)
>> rename y var2
>> save "C:\User\dataset2.dta"
>>
>> *Merge dataset 1 and 2 to key-file containing joint id's.
>> use "C:\User\id.dta"
>> merge 1:m id year using "C:\User\dataset1.dta"
>> drop _merge
>> merge 1:m id year using "C:\User\dataset2.dta"
>> drop _merge
>>
>> The resulting database would look something like this,
>>
>> Id year var1 var2
>> 1 1 10 100
>> 1 2 15 75
>> 1 3 20 65
>> 2 1 11 112
>> 2 2 17 80
>> 3 1 36 110
>> ...and so on..
>>
>> Since I have about 100 text files that need to be (1) converted into .dta, (2) reshaped into long format, (3) given renamed variables, and (4) merged into a joint database, I wonder if someone knows how I could write a routine to ease this work, which otherwise is repetitive and results in a very large .do file.