Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: dynamic line execution in mata
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: dynamic line execution in mata
Date: Mon, 10 Feb 2014 19:39:49 +0000
Phil Schumm pointed to associative arrays in a later answer.
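
A minimal sketch of the associative-array idea, not code from the thread: each variable name is used as a string key, so a name held in a string scalar is enough to retrieve the vector stored under it. The keys and values below are purely illustrative.

```stata
mata
A = asarray_create()                        // string keys, one dimension (the defaults)
asarray(A, "date",  (1 \ 2 \ 3 \ 4 \ 5))    // store a vector under the key "date"
asarray(A, "price", (10 \ 20 \ 30))         // illustrative second entry
thisvar = "date"
rows(asarray(A, thisvar))                   // looks up by name; displays 5
end
```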
I wrote -savesome- (SSC) to do what you want to do. It's not fast. Its
original version from 2001 long predates Mata. It's a (lousy)
benchmark for you.
I wouldn't call Mata line by line but that in itself is probably trivial.
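
One way to avoid the line-by-line calls, sketched under assumptions: do the whole write inside a single Mata function. The function name saveif_write() and its arguments are hypothetical, the file layout (names, types, then one matrix per variable) is an assumption, and only numeric variables are handled.

```stata
mata:
// Hypothetical helper: writes names, types, and data of the selected
// observations in one Mata call. Numeric variables only.
void saveif_write(string scalar varlist, string scalar touse,
                  string scalar fname)
{
    string rowvector varnames, vartypes
    real scalar      i, fh

    varnames = tokens(varlist)
    vartypes = J(1, cols(varnames), "")
    for (i = 1; i <= cols(varnames); i++) {
        vartypes[i] = st_vartype(varnames[i])
    }

    fh = fopen(fname, "w")          // "w" aborts if fname already exists
    fputmatrix(fh, varnames)
    fputmatrix(fh, vartypes)
    for (i = 1; i <= cols(varnames); i++) {
        // touse names a 0/1 variable marking the observations to keep
        fputmatrix(fh, st_data(., varnames[i], touse))
    }
    fclose(fh)
}
end
```

From the ado side this could be called once, for example after -marksample touse, novarlist-, as mata: saveif_write("`varlist'", "`touse'", "`using'").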
Nick
[email protected]
On 10 February 2014 19:22, Andrew Maurer <[email protected]> wrote:
> Thanks for the response, Nick. I looked into pointers and have been able to make use of them. I'll give the background of the problem. I would be very interested to hear if anyone has thoughts on the efficiency of the code I have so far (see bottom of post).
>
> I am writing a Stata program, saveif, that will save a subset of observations of a dataset to a file. One method to accomplish this would be to do something like:
> preserve
> keep if...
> save...
> restore
>
> However, for large datasets (e.g., 20 GB) from which only a few observations are to be saved (e.g., a few MB of outliers), I expect that the preserve/restore method is grossly inefficient, since it involves writing the entire dataset from memory to hard disk, then reading it back from hard disk to memory.
>
> An alternative method to accomplish the task would be to somewhat manually "file write" the individual observations to a file, without having to clear and load back the dataset from memory. I have a nearly complete example here, where there is one part that has been hard-coded to the specific example of gnp96.dta. The code is still somewhat rough.
>
> Thank you,
> Andrew Maurer
>
>
>
> * Want to write a program that will save a set of observations into a dataset
> mata: mata clear
> clear all
>
> cap program drop saveif
> program define saveif
>     syntax varlist [if] [in] using/, [replace]
>     putmata `varlist' `if' `in', view
>
>     * put a row vector called varnames to mata
>     forval i = 1/`: word count `varlist'' {
>         if `i' == 1 {
>             mata: varnames = "`: word `i' of `varlist''"
>             mata: vartypes = "`: type `: word `i' of `varlist'''"
>             mata: varpointers = &`: word `i' of `varlist'' // pointers
>         }
>         else {
>             mata: varnames = varnames,"`: word `i' of `varlist''"
>             mata: vartypes = vartypes,"`: type `: word `i' of `varlist'''"
>             mata: varpointers = varpointers,&`: word `i' of `varlist'' // pointers
>         }
>     }
>     * save vector of varnames to file
>     cap confirm new file "`using'"
>     if _rc != 0 {
>         di as error "`using' exists. replacing"
>         rm "`using'"
>     }
>     mata: fh = fopen("`using'", "w")
>     mata: fputmatrix(fh, varnames)
>     mata: fputmatrix(fh, vartypes)
>     mata: fputmatrix(fh, varpointers)
>
>     * write observations of each variable to file
>     forval i = 1/`: word count `varlist'' {
>         mata: fputmatrix(fh, `: word `i' of `varlist'')
>     }
>
>     mata: fclose(fh)
> end
>
>
> capture mata mata drop recover_from_saveif()
> mata:
> void recover_from_saveif(string fileloc)
> {
>
>     fh = fopen(fileloc, "r")
>     varnames = fgetmatrix(fh)
>     vartypes = fgetmatrix(fh)
>     varpointers = fgetmatrix(fh)
>     // ----- hard coded part!! try to get this into loop
>     date = fgetmatrix(fh)
>     gnp96 = fgetmatrix(fh)
>     // -------------------------------------------------
>     fclose(fh)
>     varcount = cols(varnames)
>
>     // ------- this loop not working yet. need to figure out syntax
>     // foreach var of varnames, read var from file to mata
>     for (i=1; i<=varcount;i++) {
>         // varnames[1,i] = fgetmatrix(fh)
>     }
>     // -------------------------------------------------
>
>     // foreach var of varnames, load var into stata with correct variable type
>     for (i=1; i<=varcount;i++) {
>         thisvarname = varnames[1,i] // eg contains "date"
>         thisvartype = vartypes[1,i] // eg contains "int"
>         thisvar = varpointers[1,i] // eg pointer to date vector
>         if (i == 1) st_addobs(rows(*thisvar))
>         st_store(., st_addvar(thisvartype,thisvarname),*thisvar)
>     }
>
> }
> end
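>
> A possible shape for the read loop, as a sketch only: each matrix is read and stored in the same pass, so no Mata object has to be named after the variable. The function name recover_sketch() is hypothetical. It assumes the file holds varnames, vartypes, and then one matrix per variable (no pointer matrix), and that all variables are numeric.
>
> ```stata
> mata:
> // Hypothetical variant; assumes layout: varnames, vartypes, one matrix per variable
> void recover_sketch(string scalar fileloc)
> {
>     string rowvector    varnames, vartypes
>     transmorphic matrix X
>     real scalar         fh, i, varcount
>
>     fh = fopen(fileloc, "r")
>     varnames = fgetmatrix(fh)
>     vartypes = fgetmatrix(fh)
>     varcount = cols(varnames)
>
>     for (i = 1; i <= varcount; i++) {
>         X = fgetmatrix(fh)                  // next variable's observations
>         if (i == 1) st_addobs(rows(X))      // create the observations once
>         // numeric only; a string variable would need st_sstore() instead
>         st_store(., st_addvar(vartypes[i], varnames[i]), X)
>     }
>     fclose(fh)
> }
> end
> ```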
>
> cap program drop recover_from_saveif
> program define recover_from_saveif
> syntax using/, [replace]
>
> mata: recover_from_saveif("`using'")
>
> end
>
>
> sysuse gnp96.dta, clear
>
> saveif * in 1/5 using test5.txt
>
> clear
> recover_from_saveif using test5.txt
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: Monday, February 10, 2014 11:59 AM
> To: [email protected]
> Subject: Re: st: dynamic line execution in mata
>
> You are presuming that such a thing exists.
>
> In essence, Mata has no direct equivalent of macro substitution.
>
> Sometimes, the way to solve (similar) problems is by direct manipulation of strings. That is the theme of
>
> SJ-11-2 pr0052 . . . . Stata tip 100: Mata and the case of the missing macros
> . . . . . . . . . . . . . . . . . . . . . . . . W. Gould and N. J. Cox
> Q2/11 SJ 11(2):323--324 (no commands)
> tip showing how to do the equivalent of Stata's macro
> substitution in Mata
>
> Sometimes, using pointers is the answer.
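>
> A minimal sketch of the pointer approach, with illustrative data: a pointer rowvector is kept parallel to the vector of names, so position i gives both the name and the contents. The names price and varptrs are assumptions made for the example.
>
> ```stata
> mata
> date     = 1 \ 2 \ 3 \ 4 \ 5
> price    = 10 \ 20 \ 30
> varnames = "date", "price"
> varptrs  = &date, &price        // pointers, in the same order as varnames
> i = 1
> varnames[1,i]                   // displays date
> rows(*(varptrs[i]))             // dereferences the i-th pointer; displays 5
> end
> ```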
>
> In this case, I'd guess that you want the Mata equivalent of some Stata operation and that there's a Mata way of doing it, but I would rather hear whether that is so than try to guess what the underlying problem is.
>
> Nick
> [email protected]
>
>
> On 10 February 2014 17:38, Andrew Maurer <[email protected]> wrote:
>> Hi Statalist,
>>
>> I am trying to find Mata's equivalent of Stata's macro expansion functionality. In the below example, I first define an object thisvar as the string "date" and I define the object date as the column vector 1 \ 2 \ 3 \ 4 \ 5. How can I return the contents of the "date" object by only referencing "thisvar"?
>>
>> In the line, rows( thisvar ), thisvar is simply the 1x1 matrix containing the string "date", so rows( thisvar ) returns: 1. What I am looking for is something like rows( `=thisvar' ), so as to return 5 rather than 1.
>>
>> ********* begin example *********
>>
>> mata
>>
>> i = 1
>> date = 1 \ 2 \ 3 \ 4 \ 5
>> varnames = "date", "price"
>> thisvar = varnames[1,i]
>> rows( thisvar ) // output: 1
>> rows( date ) // output: 5
>>
>> end
>>
>> ********* end example ***********
>>
>> Thank you,
>> Andrew Maurer
>>
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/