Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: RE: recursively search folder sub directories and store filenames in a text file
From
Tim Evans <[email protected]>
To
"[email protected]" <[email protected]>
Subject
RE: st: RE: recursively search folder sub directories and store filenames in a text file
Date
Thu, 31 Oct 2013 09:46:07 +0000
Robert,
Thanks for all of your help. I eventually went down the route of saving the results in a log file and then reading in the files; the code I used is below. I did try to use the dataset of filenames that your second suggestion creates, but I couldn't get around one problem: I have to load that dataset to access the values (filenames), yet I also need an empty dataset in memory to read each csv file into.
--BEGIN CODE--
clear all
cd "T:\Final"
cap program drop dirlist
program define dirlist
    syntax, fromdir(string)
    // list of all csv files in "`fromdir'"
    local flist: dir "`fromdir'" files "*.csv"
    foreach f of local flist {
        dis "`fromdir'/`f'"
    }
    // recursively list directories in "`fromdir'"
    local dlist: dir "`fromdir'" dirs "*"
    foreach d of local dlist {
        dirlist , fromdir("`fromdir'/`d'")
    }
end
log using filenames.log, replace
local cdir = "`c(pwd)'"
dirlist, fromdir("`cdir'")
log close
insheet using filenames.log
// Clean log file of any rows not associated with a filename and path
keep if regexm(v1, "^T") == 1
rename v1 filename
outsheet using "T:\Final\final_txt.txt", nonames replace
clear all
// each line read below already carries the double quotes that -outsheet-
// wrote around the string, so `line' is not quoted again
file open myfile using "T:\Final\final_txt.txt", read
file read myfile line
insheet using `line', comma names
di as text `line'
save master_data, replace
clear
file read myfile line
while r(eof)==0 {
    insheet using `line'
    di as text `line'
    save temp, replace
    append using master_data, force
    save master_data, replace
    **save temp, replace
    clear
    file read myfile line
}
file close myfile
append using master_data
outsheet using "T:\Final\combined_data.csv", comma names replace
--END CODE--
Best wishes
Tim
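
[Editorial sketch, not part of the original message.] The log-file step above could probably be avoided by having the recursive program write each path straight to a text file with -file write-. The following is a minimal sketch under stated assumptions: the program name dirlist_txt, the handle() option, the default "*.csv" pattern and the output name filelist.txt are all hypothetical and do not come from the thread.

```stata
cap program drop dirlist_txt
program define dirlist_txt
    // handle() is the name of a file handle opened by the caller (hypothetical option)
    syntax , fromdir(string) handle(name) [pattern(string)]
    if "`pattern'" == "" local pattern "*.csv"
    // write one line per matching file in "`fromdir'"
    local flist: dir "`fromdir'" files "`pattern'"
    foreach f of local flist {
        file write `handle' `"`fromdir'/`f'"' _n
    }
    // recurse into each sub-directory, reusing the same open handle
    local dlist: dir "`fromdir'" dirs "*"
    foreach d of local dlist {
        dirlist_txt , fromdir("`fromdir'/`d'") handle(`handle') pattern("`pattern'")
    }
end

// open the list file once, walk the tree from the current directory, then close
file open fh using "filelist.txt", write text replace
dirlist_txt , fromdir("`c(pwd)'") handle(fh)
file close fh
```

Lines written this way are not wrapped in double quotes, so a later -file read- loop would need to quote the macro itself, for example insheet using "`line'".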
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Robert Picard
Sent: 30 October 2013 15:18
To: [email protected]
Subject: Re: st: RE: recursively search folder sub directories and store filenames in a text file
If this is a one shot deal, I would have simply copied the output from the results window to a text file and processed the list from there.
Using a log file to capture the list is also simple. It does make sense however that a program that recursively lists files save the list to a dataset so here's a modified version that adds that capability. While I was at it, I added a -pattern()- option if you want to restrict the search.
Robert
* ----- begin example ---------------------------
cap program drop dirlist
program define dirlist
    syntax , fromdir(string) save(string) ///
        [pattern(string) replace append]
    // get files in "`fromdir'" using pattern
    if "`pattern'" == "" local pattern "*"
    local flist: dir "`fromdir'" files "`pattern'"
    qui {
        // initialize dataset to use
        if "`append'" != "" use "`save'", clear
        else {
            clear
            gen fname = ""
        }
        // add files to the dataset
        local i = _N
        foreach f of local flist {
            set obs `++i'
            replace fname = "`fromdir'/`f'" in `i'
        }
        save "`save'", `replace'
    }
    // recursively list directories in "`fromdir'"
    local dlist: dir "`fromdir'" dirs "*"
    foreach d of local dlist {
        dirlist , fromdir("`fromdir'/`d'") save(`save') ///
            pattern("`pattern'") append replace
    }
end
* start from the current directory
local cdir = "`c(pwd)'"
* list all files
dirlist, fromdir("`cdir'") save("allfiles.dta") replace
* list all Excel files
dirlist, fromdir("`cdir'") save("dofiles.dta") ///
pattern("*.xls") replace
* ----- end example -----------------------------
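
[Editorial sketch, not part of the original message.] One way to use the saved list without needing it and the data in memory at the same time is to copy the filenames into local macros first, then clear the list and build the combined dataset. This is only a sketch: it assumes a list dataset called csvfiles.dta (a hypothetical name) created by the program above with pattern("*.csv"), that it holds the variable fname, and that every csv file has the same columns with compatible types. If types differ between files, -append- will stop with an error unless the force option is added.

```stata
* Sketch only: csvfiles.dta is a hypothetical list created by -dirlist-
* with the save() and pattern("*.csv") options.

* 1. copy the filenames out of the list dataset into local macros
use "csvfiles.dta", clear
local nfiles = _N
forvalues i = 1/`nfiles' {
    local file`i' = fname[`i']
}

* 2. the names now live in macros, so memory is free for the data
tempfile master
forvalues i = 1/`nfiles' {
    insheet using "`file`i''", comma names clear
    // from the second file on, add everything combined so far
    if `i' > 1 append using "`master'"
    save "`master'", replace
}
* after the loop the combined data are in memory
```

Because -insheet- is given the clear option on each pass, no separate empty dataset is needed, and the temporary file is removed automatically when the do-file ends.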
On Wed, Oct 30, 2013 at 6:16 AM, Tim Evans <[email protected]> wrote:
> Robert,
>
> Thank you very much, this does indeed seem to do the trick - I am impressed! What I would like to do is save the files I list into either a .dta file, or to a text file which I can then read into Stata. The aim then will be to run through each record and open the file.
>
> The only idea I have at the moment is to open a log file and save the output there, although this might not be the best way of doing things. Do you have any advice?
>
> Best wishes
>
> Tim
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Robert
> Picard
> Sent: 29 October 2013 13:45
> To: [email protected]
> Subject: Re: st: RE: recursively search folder sub directories and
> store filenames in a text file
>
> Here is a way, from an initial directory, to recursively list all files in Stata.
>
> Robert
>
> * ----- begin example ---------------------------
> cap program drop dirlist
> program define dirlist
>
>     syntax , fromdir(string)
>
>     // list of all files in "`fromdir'"
>     local flist: dir "`fromdir'" files "*"
>     foreach f of local flist {
>         dis "`fromdir'/`f'"
>     }
>
>     // recursively list directories in "`fromdir'"
>     local dlist: dir "`fromdir'" dirs "*"
>     foreach d of local dlist {
>         dirlist , fromdir("`fromdir'/`d'")
>     }
>
> end
>
> local cdir = "`c(pwd)'"
> dirlist, fromdir("`cdir'")
>
> * ----- end example -----------------------------
>
> On Tue, Oct 29, 2013 at 8:04 AM, Tim Evans <[email protected]> wrote:
>> Hi all,
>>
>> I am using Stata 11.2 and have a working directory called "T:\Projects\Final". In this folder I have a number of subfolders, e.g. GEH_2013 and SWB_2013, and within these I have, for example, GEH_COL and GEH_OGD. Each of these lowest-level folders contains a csv file.
>>
>> So folder structure looks like :
>>
>> T:\Projects\Final
>> T:\Projects\Final\GEH_2013
>> T:\Projects\Final\GEH_2013\GEH_COL
>> T:\Projects\Final\GEH_2013\GEH_COL\GEH_COL_combined.csv
>> T:\Projects\Final\GEH_2013\GEH_OGD
>> T:\Projects\Final\GEH_2013\GEH_OGD\GEH_OGD_combined.csv
>> T:\Projects\Final\SWB_2013
>> T:\Projects\Final\SWB_2013\SWB_COL
>> T:\Projects\Final\SWB_2013\SWB_COL\SWB_COL_combined.csv
>> T:\Projects\Final\SWB_2013\SWB_OGD
>> T:\Projects\Final\SWB_2013\SWB_OGD\SWB_OGD_combined.csv
>>
>>
>> What I am trying to do is ultimately identify the names of each csv file contained at the third level of sub-directory and append the csv files into one large file.
>>
>> I have taken a look at using the following:
>>
>> rcd, :! dir *.csv /a-d /b >filelist.txt
>>
>> but all this does is create a text file in each sub-directory with the name of the csv file in that directory - so for T:\Projects\Final I have an empty text file as no csv files here, but what I need is a single text file that contains the filename and path for each csv file contained within T:\Projects\Final.
>>
>> Once I have this, my aim is to use the filenames and paths stored in the text file and to combine each csv file into one file.
>>
>> If anyone has a more elegant method of appending all csv files that are stored within sub-directories of a folder then I'd be grateful to hear!
>>
>> Best wishes
>>
>> Tim
>>
>> *********************************************************************
>> *
>> **** The information contained in the EMail and any attachments is
>> confidential and intended solely and for the attention and use of the
>> named addressee(s). It may not be disclosed to any other person
>> without the express authority of Public Health England, or the
>> intended recipient, or both. If you are not the intended recipient,
>> you must not disclose, copy, distribute or retain this message or any
>> part of it. This footnote also confirms that this EMail has been
>> swept for computer viruses by Symantec.Cloud, but please re-sweep any
>> attachments before opening or saving. http://www.gov.uk/PHE
>> *********************************************************************
>> *
>> ****
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/