Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: describe using (problem with abbrev )
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: describe using (problem with abbrev )
Date: Wed, 4 Dec 2013 21:53:19 +0000
Quite so. I didn't explain my point well enough, or indeed accurately
enough. -describe using- as it now is doesn't include expansion of
varlists; that is not to say that it could not be extended by
StataCorp to do so. As a programmer you will be familiar with projects
of your own that only got so far.
I doubt very much that
1. Commands like -describe using- actually read in the data from the
other dataset into memory, even temporarily. I suspect that
purpose-built C code reads the file from afar.
2. Stata's commands based on C code need pay any attention to syntax
in the sense of -syntax-, just as Mata pays no attention to that.
but we are trading guesses.
Nick
[email protected]
On 4 December 2013 20:42, Sergiy Radyakin <[email protected]> wrote:
> Nick,
>
> the inner workings of Stata are not known, but what Alan is asking
> about should be possible. We have a similar situation with the -use-
> command, which supports loading a subset of the data:
>
> use mpg using auto.dta, clear
>
> Here the varlist (mpg) is the 'future' varlist, not the one in
> memory now, right?
>
> You may argue that what happens behind the scenes is that Stata:
> 1) clears the memory
> 2) loads full dataset
> 3) unabbreviates the variable list
> 4) drops the variables that were not mentioned.
>
> However Stata seems to unabbreviate the list of variables without
> loading the whole dataset into memory:
>
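> version 9.0
> clear
> set mem 10m
>
> * create as many byte variables as fit in 10m; -capture- absorbs any
> * out-of-memory errors once the limit is reached
> set obs 200000
> forval i=1/99 {
>     capture generate byte x`i'=`i'
> }
>
> describe
> tempfile t
> save `t'
> clear
> * shrink memory so that the saved file can no longer be loaded in full
> set mem 3m
> use x3-x5 using `t'
> describe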
>
> Obviously the test above makes sense only in versions of Stata before
> 12.0, which introduced an automatic memory manager.
> The idea is that Stata can load x3, x4, x5 even though the full dataset
> does not fit into memory; hence we should conclude that the header
> information is processed separately, which is exactly what Alan is
> asking about in his question.
>
> It seems that specifically for the -use- command the variable list is
> treated specially, and although the same code would be applicable to
> -describe-, it is simply not reused there. I have yet to see any
> command, other than a built-in one, that accepts a varlist (in the
> expected place after the command name) referring to the future state of
> the data (I am dying to see one). I would imagine that this could also
> be implemented with a few tricks with -anything- in the syntax.
>
> I would go with a two-step solution: first get a full description of
> the dataset, then filter it for the variables of interest. Ideally
> StataCorp would provide a way to delay expansion of the varlist until
> after parsing, together with an unab(s1,s2) string function, where s1
> is a string to be treated as an abbreviated varlist and s2 is a string
> holding the universe of variables. The result would be a string of the
> full variable names from s2 that satisfy s1. This is of course
> possible to do yourself even now, but imho only if one dares to
> rewrite the -syntax- command.
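>
> A minimal sketch of that two-step idea, under stated assumptions: the
> program name -unabusing- is hypothetical (not an official command), and
> instead of an unab(s1,s2) function it uses a workaround, recreating the
> file's variable names in an empty dataset so that the existing -unab-
> command can expand abbreviations, wildcards, and ranges.
>
> ```stata
> *! hypothetical helper, not part of official Stata
> program define unabusing, rclass
>     version 11.2
>     // -anything- defers varlist expansion; "using/" strips the keyword
>     syntax anything(name=pattern) using/
>
>     // step 1: read only the variable names from the file on disk
>     quietly describe using `"`using'"', varlist
>     local universe `r(varlist)'
>
>     // step 2: build an empty dataset with the same names, in the same
>     // order, and let -unab- resolve the pattern against it
>     preserve
>     drop _all
>     foreach v of local universe {
>         quietly generate byte `v' = .
>     }
>     unab expanded : `pattern'
>     restore
>
>     return local varlist `expanded'
> end
> ```
>
> Usage might then look like this:
>
> unabusing y1-y2 using ajit_112213
> describe `r(varlist)' using ajit_112213
>
> The price is a -preserve-/-restore- cycle, which can be slow when a
> large dataset is in memory.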
>
> Best,
> Sergiy Radyakin
>
>
> On Tue, Dec 3, 2013 at 7:50 AM, Nick Cox <[email protected]> wrote:
>> Good catch by Daniel here.
>>
>> The reason that varlists with dashes are not allowed is presumably
>> that Stata can't expand what it doesn't know about. That is, the
>> dataset would have to be read in before Stata could expand a variable
>> name range, and that's the point: the dataset is being accessed
>> remotely.
>>
>> Nick
>> [email protected]
>>
>>
>> On 3 December 2013 12:40, daniel klein <[email protected]> wrote:
>>> Alan,
>>>
>>> this behavior is documented in -help describe-.
>>>
>>> "The varlist in the describe using syntax differs from standard Stata
>>> varlists in two ways. First, you cannot abbreviate variable names;
>>> that is, you have to type displacement rather than displ. However, you
>>> can use the wildcard character (~) to indicate abbreviations, for
>>> example, displ~. Second, you may not refer to a range of variables;
>>> specifying age-income is considered an error."
>>>
>>> Here is a sketch of how you could allow the dash character
>>>
>>> *! version 1.0.0 03dec2013 Daniel Klein
>>>
>>> pr descdash
>>>     vers 11.2
>>>
>>>     syntax anything using [, * ]
>>>
>>>     // collapse blanks so that "a - b" becomes "a-b"
>>>     m : st_local("uservars", stritrim(st_local("anything")))
>>>     loc uservars : subinstr loc uservars "- " "-" ,all
>>>     loc uservars : subinstr loc uservars " -" "-" ,all
>>>
>>>     // full list of variable names, in dataset order
>>>     qui d `using' ,varl
>>>     loc allvars `r(varlist)'
>>>
>>>     token `uservars'
>>>     forv j = 1/`: word count `uservars'' {
>>>         loc var : subinstr loc `j' "-" " " ,c(loc dsh)
>>>         if (`dsh') {
>>>             loc f : list posof "`: word 1 of `var''" in allvars
>>>             loc t : list posof "`: word 2 of `var''" in allvars
>>>             // a range endpoint that is not in the file has position 0
>>>             if (!`f' | !`t') {
>>>                 di as err "variable not found"
>>>                 e 111
>>>             }
>>>             if (`t' < `f') {
>>>                 di as err "variables out of order"
>>>                 e 111
>>>             }
>>>             m : st_local("var", ///
>>>                 invtokens(tokens(st_local("allvars"))[(`f'..`t')]))
>>>         }
>>>         loc varlist `varlist' `var'
>>>     }
>>>
>>>     d `varlist' `using' ,`options'
>>> end
>>> e
>>>
>>> descdash y1-y2 using ajit_112213
>>>
>>> Best
>>> Daniel
>>>
>>> --
>>> Hi _ In Stata 13 (and also in Stata 12), it appears that the
>>> abbreviation with a dash "-" does not work with -describe using
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/