Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Path to current .do file?
From: Jeph Herrin <[email protected]>
To: [email protected]
Subject: Re: st: Path to current .do file?
Date: Thu, 18 Jul 2013 14:11:06 -0400
This is one of those small features which I hope to see in every new version of Stata. I'm not sure why, since Stata
obviously knows in any given state what the "current directory" is, that this cannot be stored in a system macro.
Anyway, in addition to the other suggestions here, one solution would be to store the "main" directory path in a macro
which all of the other do-files can access.
If you use a global macro, this will be accessible by all of your do-files by default, and the relative paths could
simply incorporate the macro to produce an absolute path.
I usually use a local macro for this, because I typically have a "header.do" file that I -include- at the top of every
dofile to set up macros that I want for that dofile.
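A minimal sketch of both variants follows. The root path, the file names and the macro names (root, ROOT) are placeholders chosen for illustration, not anything from my own setup. Note that the local-macro variant relies on -include- rather than -do-: -include- runs header.do in the caller's context, so the locals it defines remain visible in the calling do-file, whereas locals defined in a file run with -do- vanish when that file ends.

```stata
* --- header.do (hypothetical), edited once per machine -----------------
* Assumed location of the project; change this one line when the project moves.
local root "C:/projects/myproject"

* Global alternative: visible in every do-file without -include-.
global ROOT "C:/projects/myproject"

* --- any other do-file in the project ----------------------------------
* Pull in the local macro defined in header.do.
include header.do

* Build absolute paths from the macro (file names are placeholders).
use "`root'/data/raw/survey.dta", clear
save "$ROOT/data/derived/survey_clean.dta", replace
```

The -include- line itself still has to find header.do, so this assumes header.do sits in the working directory or is referenced by a path that is valid on every machine.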
hth,
Jeph
On 7/18/2013 11:34 AM, Robert Picard wrote:
You can also take a look at -project-, a program I wrote to manage my
workflow in Stata. Type in Stata's Command window:
net from http://robertpicard.com/stata
and click on -project- to see a description, help file, and install if desired.
With -project-, Stata's current directory is always aligned with the
directory that contains the currently running do-file. So do-files do
not need to specify full or relative paths to access files in the
do-file's directory. You can move a whole directory up or down within
the project's directory without having to edit file paths in do-files.
Within a do-file, -project- can return the name of the currently
running do-file as well as the path to the main project directory.
Also, -project- automatically creates log files for each do-file. Each
log file is suspended when running a nested do-file and resumed when
the nested do-file terminates.
The problem with large projects that evolve over long periods of time
is that you usually run do-files out of context because it is
impractical to rerun all do-files in the project at every run. It also
becomes more difficult to spot the effects of a change on the results
of downstream do-files. With -project-, you embed build directives in
each do-file that note which files are used and created by the
do-file. -project- remembers these dependencies and automatically
skips over do-files that haven't changed and that have no change in
their dependencies.
My biggest project so far contains 5678 files total (1.2 GB) with 1886
do-files and has been chugging along for more than 3 years. If I
change a do-file that does not affect anything downstream, then only
that do-file is run when the project is built. Sometimes I make what I
think is an innocent change and -project- reruns hundreds of do-files.
The most important feature of -project- is that it can check that ALL
results can be replicated. After a replication build, each dataset
created, each output file, each log file is checked against the
pre-build version and any difference is noted.
There are other useful features. See the package description, help file,
and demo project if interested.
Robert
On Wed, Jul 17, 2013 at 11:52 PM, James Beard <[email protected]> wrote:
Phil -
Thanks for your detailed, helpful and speedy reply.
I hadn't considered that it would be better to make all the .do files
run from the project root folder. Apart from this making the .do
files themselves easier to understand (because they wouldn't contain
folder references like ../../some_folder), it also ensures that they
can be made to immediately fail if someone tries to run them from the
wrong place (because the required sub-folders almost certainly won't
exist).
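If relying on a missing sub-folder to trigger the failure feels too implicit, one possible way to make the check explicit is a guard at the top of each do-file. This is only a sketch: the marker file (data_management/clean_data.do) is borrowed from the example layout quoted below, and the choice of return code 601 (file not found) is my own assumption.

```stata
* Guard (sketch): stop at once unless the working directory is the project root.
* The marker file is an assumed name; any file known to exist relative to
* the project root would do.
capture confirm file "data_management/clean_data.do"
if _rc {
    display as error "Run this do-file from the project root."
    display as error "Current directory: `c(pwd)'"
    exit 601
}
```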
Having a master .do file would also make it easier to make sure
everything works smoothly.
Thanks again.
On 17 Jul 2013 at 21:38, Phil Schumm wrote:
On Jul 17, 2013, at 8:00 PM, James Beard <[email protected]> wrote:
In a Stata .do file (I'm using Stata 12 on Windows) is it possible to find out the path to the currently executing .do file?
I don't believe so.
I'm currently setting up some rather complicated data management in Stata, which will eventually have to deal with tens of thousands of files. Normally, I would put everything in the same folder, but in this case, that would become unmanageable. So, I have different folders for different sets of files. And I want to use relative paths to access them. If I was going to be running my .do files myself, I would just know that I have to start in the right place, but I can't guarantee that the people who will run them will do that. And I don't want to hard-wire paths in my .do files because the drive letters and paths to the "root" of my folder structure on the "production" system will be different from the root on my development system. So within each .do file, I want to -cd- to the folder in which each .do file is located, so the .do file can reliably locate files in other folders. With apologies to non-Windows users, you can do this sort of thing with "DOS" batch files, with -cd/d %~dp0-, so I could provide a .bat file wrapper for each of my .do files, but this isn't an ideal solution, and wouldn't actually stop someone running one of my .do files directly from the wrong folder.
You can accomplish what you describe above without having to resort to absolute paths (which you are correct to avoid), without putting all of your files in a single directory (which, as you note, would be unmanageable), and without a do-file knowing its own location. To do this, start by creating a directory for the project, and within it whatever subdirectory structure you wish to organize your files. Then, whenever you refer to a file location in a do-file (e.g., reading in a dataset, writing a file, running another do-file, etc.), use a relative path from the root of the project directory. This way, any do-file is runnable from the root of the project, and there is no reason to be changing the working directory (which should always be set to the root of the project). Your project is self-contained and portable, and I think you'll find that the code is easier to maintain, since your working directory remains constant (e.g., you can move do-files around within the project and they'll continue to work).
One more comment, since you mentioned making it easier for other people to run the do-files within your project. If you use the strategy I described above, instructions for executing the do-files in a project become as simple as, for example,
1. Launch Stata
2. -cd- to the root of the project
3. Type the following:
do data_management/clean_data
do data_management/build_analysis_file
do analysis/summaries
do analysis/fit_models
do analysis/plots
which can then be placed in a README.txt file at the root of the project. Alternatively, you can use a single do-file at the root of the project as a pseudo-Makefile, so that users can simply type
do make clean_data
do make analysis_file
do make summaries
do make models
do make plots
or even just
do make all
to do everything in one step.
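For illustration, a make.do along these lines might look like the sketch below. This is one possible layout, not a tested implementation: the target names and do-file paths are taken from the examples above, while the error handling and the return code 198 are assumptions.

```stata
* make.do (sketch), kept at the root of the project.
* Usage from the project root:  do make <target>
args target

* Refuse to run without a target.
if "`target'" == "" {
    display as error "usage: do make <target>"
    exit 198
}

* Reject anything that is not a known target.
local known clean_data analysis_file summaries models plots all
if !`: list target in known' {
    display as error "unknown target: `target'"
    exit 198
}

* Each step runs when it is named explicitly or when the target is "all".
* All paths are relative to the project root.
if inlist("`target'", "clean_data", "all") {
    do data_management/clean_data
}
if inlist("`target'", "analysis_file", "all") {
    do data_management/build_analysis_file
}
if inlist("`target'", "summaries", "all") {
    do analysis/summaries
}
if inlist("`target'", "models", "all") {
    do analysis/fit_models
}
if inlist("`target'", "plots", "all") {
    do analysis/plots
}
```

Unlike a real Makefile, this does no dependency tracking: -do make plots- runs only the plotting step and assumes the earlier steps have already been run.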
-- Phil
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/