Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Nick Cox <njcoxstata@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: dynamic line execution in mata
Date: Mon, 10 Feb 2014 19:39:49 +0000

Phil Schumm pointed to associative arrays in a later answer.

I wrote -savesome- (SSC) to do what you want to do. It's not fast. Its original version from 2001 long predates Mata. It's a (lousy) benchmark for you.

I wouldn't call Mata line by line but that in itself is probably trivial.

Nick
njcoxstata@gmail.com

On 10 February 2014 19:22, Andrew Maurer <Andrew.Maurer@qrm.com> wrote:
> Thanks for the response, Nick. I looked into pointers and have been able to make use of them. I'll give the background of the problem. I would be very interested to hear if anyone has thoughts on the efficiency of the code I have so far (see bottom of post).
>
> I am writing a Stata program, saveif, that will save a subset of observations of a dataset to a file. One method to accomplish this would be to do something like:
>
>     preserve
>     keep if...
>     save...
>     restore
>
> However, for large datasets (eg 20gb) and few observations to be saved (eg - a few mb of outliers), I expect that the preserve/restore method is grossly inefficient, since it involves writing the entire dataset from memory to hard-disk, then reading it back from hard-disk to memory.
>
> An alternative method to accomplish the task would be to somewhat manually "file write" the individual observations to a file, without having to clear and load back the dataset from memory. I have a nearly complete example here, where there is one part that has been hard-coded to the specific example of gnp96.dta. The code is still somewhat rough.
>
> Thank you,
> Andrew Maurer
>
> * Want to write a program that will save a set of observations into a dataset
> mata: mata clear
> clear all
>
> cap program drop saveif
> program define saveif
>     syntax varlist [if] [in] using/, [replace]
>     putmata `varlist' `if' `in', view
>
>     * put a row vector called varnames to mata
>     forval i = 1/`: word count `varlist'' {
>         if `i' == 1 {
>             mata: varnames = "`: word `i' of `varlist''"
>             mata: vartypes = "`: type `: word `i' of `varlist'''"
>             mata: varpointers = &`: word `i' of `varlist'' // pointers
>         }
>         else {
>             mata: varnames = varnames,"`: word `i' of `varlist''"
>             mata: vartypes = vartypes,"`: type `: word `i' of `varlist'''"
>             mata: varpointers = varpointers,&`: word `i' of `varlist'' // pointers
>         }
>     }
>     * save vector of varnames to file
>     cap confirm new file "`using'"
>     if _rc != 0 {
>         di as error "`using' exists. replacing"
>         rm "`using'"
>     }
>     mata: fh = fopen("`using'", "w")
>     mata: fputmatrix(fh, varnames)
>     mata: fputmatrix(fh, vartypes)
>     mata: fputmatrix(fh, varpointers)
>
>     * write observations of each variable to file
>     forval i = 1/`: word count `varlist'' {
>         mata: fputmatrix(fh, `: word `i' of `varlist'')
>     }
>
>     mata: fclose(fh)
> end
>
>
> capture mata mata drop recover_from_saveif()
> mata:
> void recover_from_saveif(string fileloc)
> {
>
>     fh = fopen(fileloc, "r")
>     varnames = fgetmatrix(fh)
>     vartypes = fgetmatrix(fh)
>     varpointers = fgetmatrix(fh)
>     // ----- hard coded part!! try to get this into loop
>     date = fgetmatrix(fh)
>     gnp96 = fgetmatrix(fh)
>     // -------------------------------------------------
>     fclose(fh)
>     varcount = cols(varnames)
>
>     // ------- this loop not working yet. need to figure out syntax
>     // foreach var of varnames, read var from file to mata
>     for (i=1; i<=varcount;i++) {
>         // varnames[1,i] = fgetmatrix(fh)
>     }
>     // -------------------------------------------------
>
>     // foreach var of varnames, load var into stata with correct variable type
>     for (i=1; i<=varcount;i++) {
>         thisvarname = varnames[1,i] // eg contains "date"
>         thisvartype = vartypes[1,i] // eg contains "int"
>         thisvar = varpointers[1,i] // eg pointer to date vector
>         if (i == 1) st_addobs(rows(*thisvar))
>         st_store(., st_addvar(thisvartype,thisvarname),*thisvar)
>     }
>
> }
> end
>
> cap program drop recover_from_saveif
> program define recover_from_saveif
>     syntax using/, [replace]
>
>     mata: recover_from_saveif("`using'")
>
> end
>
>
> sysuse gnp96.dta, clear
>
> saveif * in 1/5 using test5.txt
>
> clear
> recover_from_saveif using test5.txt
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
> Sent: Monday, February 10, 2014 11:59 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: dynamic line execution in mata
>
> You are presuming that such a thing exists.
>
> In essence, Mata has no direct equivalent of macro substitution.
>
> Sometimes, the way to solve (similar) problems is by direct manipulation of strings. That is the theme of
>
> SJ-11-2 pr0052 . . . . Stata tip 100: Mata and the case of the missing macros
>         . . . . . . . . . . . . . . . . . . . . . . . . W. Gould and N. J. Cox
>         Q2/11   SJ 11(2):323--324                                (no commands)
>         tip showing how to do the equivalent of Stata's macro
>         substitution in Mata
>
> Sometimes, using pointers is the answer.
>
> In this case, I'd guess that you want the Mata equivalent of some Stata operation and that there's a Mata way of doing it, but I would rather hear whether that is so than try to guess what the underlying problem is.
>
> Nick
> njcoxstata@gmail.com
>
>
> On 10 February 2014 17:38, Andrew Maurer <Andrew.Maurer@qrm.com> wrote:
>> Hi Statalist,
>>
>> I am trying to find Mata's equivalent of Stata's macro expansion functionality. In the below example, I first define an object thisvar as the string "date" and I define the object date as the column vector 1 \ 2 \ 3 \ 4 \ 5. How can I return the contents of the "date" object by only referencing "thisvar"?
>>
>> In the line, rows( thisvar ), thisvar is simply the 1x1 matrix containing the string "date", so rows( thisvar ) returns: 1. What I am looking for is something like rows( `=thisvar' ), so as to return 5 rather than 1.
>>
>> ********* begin example *********
>>
>> mata
>>
>> i = 1
>> date = 1 \ 2 \ 3 \ 4 \ 5
>> varnames = "date", "price"
>> thisvar = varnames[1,i]
>> rows( thisvar ) // output: 1
>> rows( date ) // output: 5
>>
>> end
>>
>> ********* end example ***********
>>
>> Thank you,
>> Andrew Maurer
>>
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/
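
The thread names two ways to reach a Mata object when all you hold is its name as a string: pointers and associative arrays. The following is only a minimal sketch of both, not code posted by anyone in the thread. It reuses the vector name date from the original example; the vector gnp96, its values, and the names varptrs and A are made up for illustration.

```mata
mata:
// Illustrative data only; gnp96's values are invented for this sketch.
date  = 1 \ 2 \ 3 \ 4 \ 5
gnp96 = 10 \ 20 \ 30 \ 40 \ 50
varnames = ("date", "gnp96")
i = 1

// Approach 1: keep a pointer vector parallel to the vector of names.
// Element i of varptrs points at the object named in varnames[1, i].
varptrs = (&date, &gnp96)
rows(*(varptrs[i]))             // displays 5

// Approach 2: an associative array keyed by name.
// asarray_create() with no arguments uses string scalar keys.
A = asarray_create()
asarray(A, "date", date)        // store under the key "date"
asarray(A, "gnp96", gnp96)
thisvar = varnames[1, i]        // thisvar contains "date"
rows(asarray(A, thisvar))       // displays 5
end
```

The two approaches are not interchangeable: a pointer refers to the original vector, while the associative array holds its own copy of whatever was stored under the key. Which one suits saveif would depend on whether the data should be duplicated in memory.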