Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Confirming whether a variable is binary or continuous
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Confirming whether a variable is binary or continuous
Date: Fri, 16 Mar 2012 22:34:25 +0000
You can also do this by e.g.
assert inlist(var, 0, 1)
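
A minimal sketch of how that check might be used inside a program, so that a failed assertion does not stop execution. The auto dataset and its variable foreign are stand-ins chosen for illustration and are not from the thread. Note that inlist() returns 0 for missing values, so the plain assertion fails if the variable has any missings, and that it succeeds for a variable that is all 0 or all 1.

```stata
* Illustrative only: auto.dta and -foreign- stand in for your own data
sysuse auto, clear

* -capture- suppresses the error; _rc is 0 if the assertion holds
capture assert inlist(foreign, 0, 1)
if _rc == 0 {
    display "foreign takes only the values 0 and 1"
}
else {
    display "foreign has values other than 0 and 1 (or missing values)"
}

* Variant that ignores missing values
capture assert inlist(foreign, 0, 1) if !missing(foreign)
display "return code ignoring missings: " _rc
```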
Nick
On Fri, Mar 16, 2012 at 10:28 PM, daniel klein
<[email protected]> wrote:
> Bert,
>
> as you already realized, there is no possibility to tell whether a
> variable is intended to be a binary indicator or merely happens to
> only have values 0 and 1. For this purpose you will need more
> information on that variable. An option, indicating continuous
> variables, seems to be a good idea.
>
> However, I would like to add some thoughts here.
>
> For checking binary variables, -tabulate- is useful, but the number
> of rows in r(r) is not all it has to offer. Note that a variable with
> values 1 and 2 will also result in r(r) = 2 and therefore will be
> declared a binary variable by your program. Here is how I checked for
> binary variables in one of my programs, using -tabulate- with the
> -matrow()- option:
>
> [...]
> tempname M
> qui ta <var>, matrow(`M')
> if (r(r) != 2) | (`M'[1, 1] != 0) | (`M'[2, 1] != 1) {
>     di "<var> is not a binary variable"
> }
> [...]
>
> You will have to make sure <var> is not a string variable, because
> the -matrow()- option is not allowed with string variables. If you do
> not want to check, you can use -levelsof- to get the values of any
> variable. In any case, user-written software is not required here
> (although the first versions of -levelsof- were, at least partly,
> user-written by Nick Cox, as far as I know).
>
> I would not use -compress- as it is, in general, a bad idea to make
> (any) changes to the user's dataset if these changes are not the very
> purpose of your program. You could use -preserve- to avoid permanent
> changes but my guess is your program will execute faster if you just
> use -tabulate- (as shown above) in a loop for all numeric variables
> (not declared "continuous" by the user).
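
The loop described at the end of the quoted message could be sketched roughly as follows. This is only one possible reading of that suggestion: the auto dataset and the local macro names numvars and binary are assumptions made for illustration, and the handling of a user-supplied list of continuous variables is left out.

```stata
* Illustrative only: auto.dta and the macro names are hypothetical choices
sysuse auto, clear

tempname M
local binary

* -ds- leaves the names of all numeric variables in r(varlist)
quietly ds, has(type numeric)
local numvars `r(varlist)'

foreach v of local numvars {
    * -capture- guards against errors such as too many distinct values
    capture quietly tabulate `v', matrow(`M')
    if _rc {
        continue
    }
    * exactly two distinct values, and those values are 0 and 1
    if r(r) == 2 {
        if `M'[1, 1] == 0 & `M'[2, 1] == 1 {
            local binary `binary' `v'
        }
    }
}

display "variables with exactly the values 0 and 1: `binary'"
```

Because -ds, has(type numeric)- selects numeric variables only, the restriction on -matrow()- with string variables does not come into play, and the dataset itself is never modified.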