All,
I asked the following question on Friday and received great help
that solved my problem. Michael Blasnik, Austin Nichols,
and Jens Lauritsen actually wrote programs for me in different
flavors. I summarize the solutions below in case anyone needs help with a
similar problem. I'm grateful to all of you.
My Question:
I have a list of text files in a folder that become available daily. I
would like to access these daily files by date in my program. Is it
possible in Stata to identify the datasets by the date they were created?
My datasets are named like the following:
Name               Date Modified
PARCEL.G1911V00    03/29/05
PARCEL.G1921V00    03/29/05
PARCEL.G1914V00    03/30/05
and so on.
In my program, I would only like to capture PARCEL.G1911V00 and
PARCEL.G1921V00 since they are created on the same date (03/29/05) and
ignore PARCEL.G1914V00 which I would like to capture on 03/30/05's run.
Nick Cox:
I don't know an easy way to do this
from within Stata.
Depending on which OS you are using, you could route the
results of a -dir- or -ls- to a file and process that.
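(Not part of Nick's reply.) As a rough illustration of processing a saved
listing, here is a minimal Python sketch. It assumes a Windows-style -dir-
layout of "mm/dd/yyyy time [AM|PM] size name"; the function names
parse_dir_line and files_on are made up for this example, and other
systems or locales will lay out their columns differently.

```python
from datetime import date, datetime

def parse_dir_line(line):
    """Return (date, filename) for one file line of a dir listing, else None.

    Assumed layout: '03/29/2005  10:15 AM    1,234 PARCEL.G1911V00'.
    """
    words = line.split()
    if len(words) < 4:
        return None
    try:
        day = datetime.strptime(words[0], "%m/%d/%Y").date()
    except ValueError:
        return None  # header, footer, or blank line
    # An AM/PM marker shifts the size and name columns by one word.
    name_at = 4 if words[2].upper() in ("AM", "PM") else 3
    if len(words) <= name_at or words[name_at - 1] == "<DIR>":
        return None  # directory entry, or no name found
    return day, " ".join(words[name_at:])

def files_on(listing_text, day):
    """Return the names of all files in the listing dated `day`."""
    parsed = (parse_dir_line(line) for line in listing_text.splitlines())
    return [item[1] for item in parsed if item and item[0] == day]
```

Calling files_on(text, date(2005, 3, 29)) on the saved listing would then
give the names to read into Stata for that day's run.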
It may well be easier for you to write a script in your favourite
scripting language (Perl, Python, Awk, whatever) to ensure that
files are renamed so that the names show dates more transparently,
and then to read into Stata files satisfying a given pattern.
Naturally, your set-up may prohibit that.
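(Not part of Nick's reply.) A minimal Python sketch of the scripting
route, assuming the file system's modification time is the date of
interest. The names files_modified_on and dated_name, and the
"date_name" naming scheme, are hypothetical choices for this example.

```python
import os
from datetime import date, datetime

def files_modified_on(folder, day):
    """Return sorted names of regular files in `folder` last modified on `day`."""
    names = []
    with os.scandir(folder) as entries:
        for entry in entries:
            if not entry.is_file():
                continue
            # Modification time is converted to a local calendar date.
            modified = datetime.fromtimestamp(entry.stat().st_mtime).date()
            if modified == day:
                names.append(entry.name)
    return sorted(names)

def dated_name(name, day):
    """Build a name that shows the date, e.g. 2005-03-29_PARCEL.G1911V00."""
    return f"{day.isoformat()}_{name}"
```

A daily job could pass date.today() to files_modified_on and rename each
hit with os.rename(old, dated_name(...)), after which Stata only needs to
read the files matching that day's prefix.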
Alternatively, you could write something to
work out the difference between the files
there now and last time you looked.
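(Not part of Nick's reply.) The "difference since last time" idea could
be sketched in Python as below. The function name new_files and the
snapshot file are assumptions for this example; the snapshot should live
outside the data folder so that it is not itself listed as a new file.

```python
import os

def new_files(folder, snapshot_path):
    """Return names in `folder` not seen on the previous run, then update the snapshot."""
    try:
        with open(snapshot_path, encoding="utf-8") as fh:
            seen = set(fh.read().splitlines())
    except FileNotFoundError:
        seen = set()  # first run: every file counts as new
    current = {entry.name for entry in os.scandir(folder) if entry.is_file()}
    # Remember what is there now for the next run.
    with open(snapshot_path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(sorted(current)))
    return sorted(current - seen)
```

Unlike the date-based approaches, this does not depend on the date format
of directory listings, but it does treat a file that was replaced under
the same name as already seen.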
Austin Nichols's program:
prog def gettoday, rclass
preserve
cap drop _all
!del tempdir.txt
!dir > tempdir.txt
infix str date 1-10 str time 11-19 str size 20-38 str fname 39-80 using tempdir.txt
gen fdate=date(date,"mdy")
keep if fdate==date(c(current_date), "dmy")
keep if fname!="."
keep if fname!=".."
forval i=1/`=_N' {
if "`: di fname[`i']'"!="tempdir.txt" {
local names="`names' `: di fname[`i']'"
}
}
restore
if "`names'"=="" {
return local names "."
}
else {
return local names "`names'"
}
end
Then type
. qui gettoday
. ret li
and the list of today's file names will be visible in r(names).
Michael Blasnik's program:
program define dayfiles, rclass
version 8.2
syntax [, Filespec(str) IFDate(str)]
preserve
drop _all
if "`filespec'"=="" local filespec "*.*"
!dir `filespec' > mydir.txt
quietly{
infix str x 1-80 using mydir.txt
split x
gen int date=date(x1,"mdy",2040)
drop if date==.
rename x5 filename
keep date filename
if "`ifdate'"!="" keep if date==`ifdate'
forval i =1/`=_N-1' {
local f `"`f'`"`=filename[`i']'"' "'
}
if _N>0 local f `"`f'`"`=filename[_N]'"'"'
}
return local files `"`f'"'
end
Morten Andersen's program:
-------------------------------- BEGIN dirlist.ado -------------------------
*! 1.3 MA 2004-10-06 23:56:48
* saves directory data in r() macros fnames, fdates, ftimes, fsizes, nfiles
* used by dodoc.ado
*--------+---------+---------+---------+---------+---------+---------+------
program define dirlist, rclass
version 8
syntax anything
tempfile dirlist
if "`c(os)'" == "Windows" {
local shellcmd `"dir `anything' > `dirlist'"'
}
if "`c(os)'" == "MacOSX" {
local anything = subinstr(`"`anything'"', `"""', "", .)
local shellcmd `"ls -lT `anything' > `dirlist'"'
}
if "`c(os)'" == "Unix" {
local anything = subinstr(`"`anything'"', `"""', "", .)
local shellcmd `"ls -l --time-style='+%Y-%m-%d %H:%M:%S'"'
local shellcmd `"`shellcmd' `anything' > `dirlist'"'
}
quietly shell `shellcmd'
* read directory data from temporary file
tempname fh
file open `fh' using "`dirlist'", text read
file read `fh' line
local nfiles = 0
local curdate = date("`c(current_date)'","dmy")
local curyear = substr("`c(current_date)'",-4,4)
while r(eof)==0 {
if `"`line'"' ~= "" & substr(`"`line'"',1,1) ~= " " {
* read name and data for each file
if "`c(os)'" == "MacOSX" {
local fsize : word 5 of `line'
local fda : word 6 of `line'
local fmo : word 7 of `line'
local ftime : word 8 of `line'
local fyr : word 9 of `line'
local fname : word 10 of `line'
local fdate = ///
string(date("`fmo' `fda' `fyr'","mdy"),"%dCY-N-D")
}
if "`c(os)'" == "Unix" {
local fsize : word 5 of `line'
local fdate : word 6 of `line'
local ftime : word 7 of `line'
local fname : word 8 of `line'
}
if "`c(os)'" == "Windows" {
local fdate : word 1 of `line'
local ftime : word 2 of `line'
local word3 : word 3 of `line'
if upper("`word3'")=="AM" | upper("`word3'")=="PM" {
local ftime "`ftime'-`word3'"
local fsize : word 4 of `line'
local fname : word 5 of `line'
}
else {
local fsize : word 3 of `line'
local fname : word 4 of `line'
}
}
local fnames "`fnames' `fname'"
local fdates "`fdates' `fdate'"
local ftimes "`ftimes' `ftime'"
local fsizes "`fsizes' `fsize'"
local nfiles = `nfiles' + 1
}
file read `fh' line
}
file close `fh'
return local fnames `fnames'
return local fdates `fdates'
return local ftimes `ftimes'
return local fsizes `fsizes'
return local nfiles `nfiles'
end
* end
-------------------------------- END dirlist.ado -------------------------
-------------------------------- BEGIN dirlist.hlp -------------------------
{smcl}
{* 2004-03-12 15:57:14}{...}
{hline}
help for {hi:dirlist} {right: (version 1.3, 2004-10-06)}
{hline}
{title:Retrieve directory information}
{p 4 13 2}{cmd:dirlist} [{it:filespec}]
{title:Description}
{p 4 4 2}
{cmd:dirlist} is used like the {cmd:dir} command, but returns the
information about files in return macros (see below).
{p 4 4 2}
{it:filespec} may be any valid Windows, Unix, or Macintosh file path or
file specification (see {hi:[U] 14.6 File-naming conventions}) and may
include "{cmd:*}" to indicate any string of characters.
{p 4 4 2}
Directory data are written to a temporary file using shell commands
(Windows {cmd:dir} and Mac OS X or Unix {cmd:ls}) and subsequently read
by the program.
{p 4 4 2}
Mac OS X: Spaces in the {it:filespec} should be preceded by an escape
character "{cmd:\}".
{title:Examples}
{p 4 8 2}
{cmd:. dirlist dm50*.do}
{p 4 4 2}
You can then access the returned results:
{p 4 4 2}
{cmd:. return list}
macros:
r(nfiles) : "4"
r(fsizes) : "814 209 296 493"
r(ftimes) : "13:27:15 13:29:05 12:22:01 13:41:09"
r(fdates) : "2003-10-30 2003-10-30 2003-10-30 2003-10-30"
r(fnames) : "dm501.do dm502.do dm503.do dm504.do"
{p 4 8 2}
{cmd:. dirlist ~/DM\ data/dm50*.do} {it:(Mac OS X, space in directory name)}
{title:Note}
{p 4 4 2}
The ado-file has been tested on Mac OS X, Windows XP and one type of
Linux. Problems could occur on other systems if the directory listing
differs in column arrangement or date format.
{title:Author}
{p 4 4 2}
Morten Andersen, Research Unit for General Practice{break}
University of Southern Denmark, Denmark{break}
[email protected]
{title:Also see}
{p 4 13 2}
Online: help for
{help dir},
{help shell},
{help return}
{p_end}
-------------------------------- END dirlist.hlp -------------------------