Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: How to reference results from a big dataset within a program
From: "Chen,Minxing" <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: RE: st: How to reference results from a big dataset within a program
Date: Wed, 28 Aug 2013 16:58:53 +0000
Thank you Phil and Christopher for the very valuable suggestions!
-- Richard, I totally agree with you: we do learn new things every day, even about things we thought we already knew well. I didn't expect that my single email would generate so much help; this proves again that the Statalist community is an excellent and responsible one.
Minxing
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Richard Williams
Sent: Wednesday, August 28, 2013 8:26 AM
To: [email protected]; [email protected]
Subject: Re: st: How to reference results from a big dataset within a program
At 06:06 AM 8/28/2013, Phil Schumm wrote:
>On Aug 27, 2013, at 4:25 PM, "Chen,Minxing" <[email protected]> wrote:
> > Basically, in the program I submitted, I had to reference results
> > from a big pre-simulated dataset (four variables, but around
> > 400,000 observations). In my previous submission, I simply submitted
> > the pre-simulated dataset with my program, and within the program I
> > loaded that simulated dataset with code such as
> > "use c:\ado\personal\simudata". I was hoping that when people download
> > the program from SSC, the pre-simulated dataset would also be
> > downloaded to the directory "c:\ado\personal\".
> >
> > Now my reviewer has indicated that I can't expect users to do that; I
> > can't even tell the user to place the file there, because such a
> > directory may not be creatable for the user (e.g., they might not have
> > a C: drive). The reviewer suggested that I find some other way to get
> > the information in my pre-simulated dataset, such as incorporating the
> > data into the program.
> >
> > I tried to copy the simulated data into my program using
> > syntax such as "input x y z k"; however, since there are so many
> > observations (a little more than 400,000) and there is a system limit
> > on the maximum number of lines within a program (around 3500), I was
> > not able to do it this way. The reviewer also mentioned that I might
> > use a "Mata library" function, but I am pretty new to Mata. Is there
> > anyone who may be able to help with this issue?
>
>
>Basically you have two options. The first would be to deliver the
>dataset (i.e., .dta file) automatically along with the package. See
>-help usersite- or [R] net for the complete details, but essentially
>you'll want to use "F mydata.dta" rather than "f mydata.dta" to force
>the dataset to be installed in the system directories rather than the
>user's current working directory. You then call the dataset with
>
> sysuse mydata
>
>This way, everything will "just work" regardless of the user's local
>setup, and users don't need to know (or worry) about where the file is
>located. This also makes it easy for you to update the file at a later
>date, if necessary.
>
>The alternative would be to place the dataset on the web somewhere, and
>access it from within your code using the URL. The downside to this is
>that your command won't work unless the user has an internet
>connection, which would be annoying.
You learn something new every day. I would add: (a) give the dataset a name that is somewhat esoteric and unlikely to be otherwise used, and (b) give it a name that associates it with the program so that people don't wonder where it came from, e.g. myprog_data. Of course, I would give the same advice for all the files that will be installed.
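
As an illustration only, here is a minimal sketch of the kind of package (.pkg) file that Phil's first option implies. It assumes a hypothetical command named myprog with a help file myprog.sthlp and a data file myprog_data.dta; all three names are made up for the example and are not taken from the original submission. The capital F on the last line is what requests that the dataset be installed in the system directories rather than the current working directory:

    v 3
    d myprog: hypothetical command that relies on a pre-simulated dataset
    f myprog.ado
    f myprog.sthlp
    F myprog_data.dta

Under the same assumptions, the program itself could then load the data without any hard-coded path:

    sysuse myprog_data, clear

See -help usersite- for the authoritative description of the package file format.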
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
*