I appreciate your thorough answer, Nick; it just couldn't get better.
I'll get right on to it. Thanks again,
Dimitris
Nick Cox wrote:
>
> [email protected]
>
> > I want to generate the following frequency table:
> > for example,
> >
> >                          VARIABLE X
> >                (0.5-1.0) (1.0-1.5) (1.5-2.0) (......)
> >   V  (0.1-0.2)
> >   A  (0.2-0.3)
> >   R  (0.3-0.4)
> >   Y  (.......)
> >
> >
> > I have two continuous variables and I want to create a table that
> > is essentially a scatterplot showing numbers of observations
> > instead of points in a graph. It's also necessary that I be able
> > to determine the upper and lower limits of the intervals (as they
> > don't necessarily have to be balanced).
> >
> > I don't know whether there is already a written command for
> > this; perhaps I was not careful enough to find it.
>
> I don't think there is a command to do this in one, but
> no matter. As it happens, I'd argue that this is a problem
> for which there should not be a single command, as it splits
> quite naturally into two distinct problems.
>
> In essence, you want to set up a subdivision of variables
> into classes or bins and then get a cross-tabulation.
> Only the first requires any work.
>
> There was some discussion of similar issues
> in a thread on rounding down (and up) started
> on 22 June. This answer draws on a write-up
> of that thread, in press in the Stata Journal 3(4) 2003
> as a tip (see the end of
> http://www.stata-journal.com/sjfaq.html#types
> for an explanation of Stata tips).
>
> Suppose you want to round down, in multiples of some fixed number.
> For concreteness, say you want to round -mpg- in the auto data
> in multiples of 5, so that any values 10-14 get rounded to 10, any
> values 15-19 to 15, etc. -mpg- is simple in that only integer
> values occur; in many other cases we clearly have fractional parts
> to think about as well, although the solutions do not differ.
>
> Here is an easy solution: 5 * floor(mpg/5). -floor()-, added in
> Stata 8, always rounds down to the largest integer less than or
> equal to its argument. The name "floor" is due to Kenneth E. Iverson
> (1962), the principal architect of APL, who also suggested an
> expressive notation I can't emulate here as I'm font-challenged.
> For further discussion, see Knuth (1997, p.39) or Graham, Knuth and
> Patashnik (1994, Ch.3).
>
> As it happens, 5 * int(mpg/5) gives exactly the same result
> for -mpg- in the auto data, but in general whenever variables
> may be negative as well as positive,
>
> interval * floor(expression / interval)
>
> gives a more consistent classification.
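>
> As a minimal sketch of the difference (the negative value and the
> new variable names are made up for illustration):
>
> ```stata
> * int() truncates toward zero; floor() always rounds down
> * int(-1.4) is -1, so this displays -5
> display 5 * int(-7/5)
> * floor(-1.4) is -2, so this displays -10
> display 5 * floor(-7/5)
>
> * for nonnegative values such as -mpg- the two agree
> sysuse auto, clear
> generate mpg_floor = 5 * floor(mpg/5)
> generate mpg_int   = 5 * int(mpg/5)
> assert mpg_floor == mpg_int
> ```
>
> With -int()-, everything strictly between -5 and 5 falls into
> class 0, so that class is twice as wide as the others.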
>
> Let us compare this briefly with other possible solutions.
> -round(mpg, 5)- is different, as this rounds to the nearest
> multiple of 5, which could be either rounding up or rounding down.
> -round(mpg - 2.5, 5)- should be fine, but is also a little too
> much like a dodge.
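>
> A quick way to compare these variants side by side, as a sketch
> only (the new variable names are made up):
>
> ```stata
> sysuse auto, clear
> generate mpg_floor = 5 * floor(mpg/5)
> * nearest multiple of 5: may round up or down
> generate mpg_near  = round(mpg, 5)
> * shifted by half an interval: intended to round down
> generate mpg_shift = round(mpg - 2.5, 5)
> * inspect the three classifications for the first few cars
> list mpg mpg_floor mpg_near mpg_shift in 1/10
> ```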
>
> With the function -recode()- you need two dodges, say
> -recode(-mpg,-40,-35,-30,-25,-20,-15,-10)-, in which the leading
> minus sign is part of the expression. Note all the negative
> signs: negating the argument and then negating the result to
> reverse it are both necessary because -recode()- uses its numeric
> arguments as upper limits, i.e. it rounds up. Naturally, if you
> want rounding up, that is fine.
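>
> Spelled out as a command, a sketch for the auto data might look
> like this (the name mpg_rec is made up):
>
> ```stata
> sysuse auto, clear
> * negate the argument and the limits, then negate the result
> generate mpg_rec = -recode(-mpg,-40,-35,-30,-25,-20,-15,-10)
> * for -mpg- this should agree with the -floor()- solution
> assert mpg_rec == 5 * floor(mpg/5)
> ```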
>
> -egen, cut()- offers another solution with the option -at(10(5)45)-.
> Being able to specify a numlist is nice, as
> compared with spelling out a comma-separated list, but you
> must also add an extra upper limit, here 45, which will not be used
> as a class; otherwise with -at(10(5)40)- values in your highest
> class will be returned as missing.
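>
> A sketch of both calls, assuming the auto data (the variable names
> mpg_cut and mpg_cut2 are made up):
>
> ```stata
> sysuse auto, clear
> * classes [10,15), [15,20), ..., [40,45), each coded by its lower limit
> egen mpg_cut = cut(mpg), at(10(5)45)
> assert mpg_cut == 5 * floor(mpg/5)
> * without the extra limit, values of 40 or more are set to missing
> egen mpg_cut2 = cut(mpg), at(10(5)40)
> count if missing(mpg_cut2)
> ```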
>
> Yutaka Aoki also suggested to me -mpg - mod(mpg,5)-
> which follows immediately once you see that rounding down
> amounts to subtracting the appropriate remainder. -mod(,)-,
> however, does not offer a correspondingly neat way of rounding up.
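>
> As a short sketch for the auto data (mpg_mod is a made-up name):
>
> ```stata
> sysuse auto, clear
> * subtract the remainder on division by 5
> generate mpg_mod = mpg - mod(mpg, 5)
> assert mpg_mod == 5 * floor(mpg/5)
> ```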
>
> The -floor- solution grows on one, and it has the merit that
> you do not need to spell out all the possible end values, with the
> risk of forgetting or mistyping some. Conversely, -recode()-
> and -egen, cut()- are not restricted to rounding in equal
> intervals and remain useful for more complicated problems.
>
> Without recapitulating the whole argument insofar as it applies to
> rounding up, -floor()-'s sibling -ceil()- (short for
> ceiling) gives a nice way of rounding up in equal intervals, and
> is easier to work with than expressions based on -int()-.
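>
> For instance, a sketch of rounding -mpg- up in multiples of 5
> (mpg_up is a made-up name):
>
> ```stata
> sysuse auto, clear
> * values 11-15 go to 15, values 16-20 to 20, and so on
> generate mpg_up = 5 * ceil(mpg/5)
> tabulate mpg_up
> ```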
>
> So the example given looks like
>
> gen roundedx = 0.5 * floor(x/0.5)
> gen roundedy = 0.1 * floor(y/0.1)
>
> if you want rounding down, or the same with -ceil()-
> if you want rounding up, or something with the
> -recode()- function or -egen, cut()- if you want
> unequal intervals.
>
> tab roundedy roundedx
>
> then gives the tabulations. You probably want to
> keep variable labels etc. One way to do that
> is to use -copydesc- from SSC.
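>
> For unequal intervals, one possible sketch uses -egen, cut()- on
> both variables. Everything here is assumed for illustration: the
> data are simulated, the class limits are arbitrary, and the names
> xclass and yclass are made up. -runiform()- needs a more recent
> Stata; older versions have -uniform()- instead.
>
> ```stata
> clear
> set obs 500
> set seed 12345
> * made-up data: x in (0.5, 2), y in (0.1, 0.4)
> generate double x = 0.5 + 1.5 * runiform()
> generate double y = 0.1 + 0.3 * runiform()
> * unequal classes, each coded by its lower limit;
> * the last number in each list serves only as an upper bound
> egen xclass = cut(x), at(0.5 1 1.25 1.5 2)
> egen yclass = cut(y), at(0.1 0.15 0.2 0.4)
> label variable xclass "x (lower class limit)"
> label variable yclass "y (lower class limit)"
> tabulate yclass xclass
> ```
>
> Each cell then holds the number of observations falling in that
> combination of x and y classes.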
>
> Graham, R. L., D. E. Knuth and O. Patashnik. 1994.
> Concrete mathematics: a foundation for computer science.
> Reading, MA: Addison-Wesley.
>
> Iverson, K. E. 1962. A programming language.
> New York: John Wiley.
>
> Knuth, D. E. 1997. The art of computer programming: Volume
> 1, Fundamental algorithms. Reading, MA: Addison-Wesley.
>
> Nick
> [email protected]
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
--
---------------------------------------------
Dimitris Christodoulou
Associate Researcher
School for Business and Regional Development
University of Wales, Bangor
Hen Coleg
LL57 2DG Bangor
UK
e-mail: [email protected]
---------------------------------------------