Here is an update (bugfix of silly mistake and nicer labels) and a help
file.
-- Maarten
*----------------- begin histtab.ado -----------------------
*! version 1.0.1 MLB 23Jan2008
program define histtab
    version 8.2
    * accept if/in and fweights, as documented in the help file
    syntax varname [if] [in] [fweight] [, *]
    tempvar h x x2
    * bin heights go into `h', bin centers into `x' (first r(bin) observations)
    twoway__histogram_gen `varlist' `if' `in' [`weight'`exp'], gen(`h' `x') `options'
    * running bin number; int rather than byte so that more than 100 bins fit
    qui gen int `x2' = _n if `x' < .
    label variable `x2' `"`: variable label `x''"'
    * copy the returned results before another command can overwrite r()
    local start = r(start)
    local bin = r(bin)
    local width = r(width)
    * build one value label per bin: "lower limit - upper limit"
    tempname lab
    local end = `start'
    forvalues i = 1/`bin' {
        local start = `end'
        local rstart : di %9.2g `start'
        local rstart : list retokenize rstart
        local end = `end' + `width'
        local rend : di %9.2g `end'
        local rend : list retokenize rend
        label define `lab' `i' "`rstart' - `rend'", add
    }
    label values `x2' `lab'
    tabdisp `x2' if `x2' < ., cellvar(`h')
    * value labels are not dropped automatically, so clean up
    label drop `lab'
end
*------------------ end histtab.ado ------------------------
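A quick way to try it out, as a minimal sketch: this assumes histtab.ado has been saved somewhere on your ado-path and uses the auto data that ships with Stata. The variables and option values are only illustrations, not part of the program.

```stata
* assumes histtab.ado is saved somewhere on the ado-path
sysuse auto, clear
* default: densities, with the number of bins chosen automatically
histtab mpg
* frequencies in 5 bins of equal width
histtab mpg, frequency bin(5)
* one bin per distinct value of a discrete variable, shown as fractions
histtab rep78, discrete fraction
```

All options are passed straight through to twoway__histogram_gen, so anything that command rejects will be rejected here as well.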
*----------------- begin histtab.hlp -----------------------
{smcl}
{* *! version 1.0.0 23Jan2008}{...}
{cmd:help histtab}
{hline}
{title:Title}
{p 4 35 2}
{hi:histtab} {hline 2} Tabulate a histogram
{title:Syntax}
{p 8 12 2}
{cmd:histtab}
{varname}
{ifin} {weight}
[{cmd:,}
{c -(}{it:discrete_options}|{it:continuous_options}{c )-}
{it:common_options}]
{pstd}
where {it:discrete_options} are
{it:discrete_options}{col 42}description
{hline 65}
{cmdab:d:iscrete}{...}
{col 42}specify data are discrete
{cmd:width(}{it:#}{cmd:)}{...}
{col 42}width of bins in {it:varname} units
{cmd:start(}{it:#}{cmd:)}{...}
{col 42}theoretical minimum value
{hline 65}
{pstd}
and where {it:continuous_options} are
{it:continuous_options}{col 42}description
{hline 65}
{cmd:bin(}{it:#}{cmd:)}{...}
{col 42}{it:#} of bins
{cmd:width(}{it:#}{cmd:)}{...}
{col 42}width of bins in {it:varname} units
{cmd:start(}{it:#}{cmd:)}{...}
{col 42}lower limit of first bin
{hline 65}
{pstd}
and where {it:common_options} are
{it:common_options}{col 42}description
{hline 65}
{cmdab:den:sity}{...}
{col 42}tabulate density (default)
{cmdab:frac:tion}{...}
{col 42}tabulate fractions
{cmdab:freq:uency}{...}
{col 42}tabulate frequencies
{cmd:percent}{...}
{col 42}tabulate percentages
{cmd:display}{...}
{col 42}display (bin) start and width
{hline 65}
{pstd}
{cmd:fweight}s are allowed; see {help weights}.
{title:Description}
{pstd}
{cmd:histtab} displays as a table what would otherwise be displayed as a
histogram.
{title:Options}
{title:Options for use in the continuous case}
{phang}
{opt bin(#)} and {opt width(#)} are alternatives. They specify how the
data are to be aggregated into bins; {opt bin()} by specifying the
number of bins (from which the width can be derived) and {opt width()}
by specifying the bin width (from which the number of bins can be
derived).
{pmore}
If neither option is specified, results are the same as if {opt bin(k)}
were specified, where
{phang3}
{it:k} = min{c -(}sqrt({it:N}), 10*ln({it:N})/ln(10){c )-}
{pmore}
and where {it:N} is the number of observations.
{phang}
{opt start(#)} specifies the theoretical minimum of {it:varname}. The
default is {opt start(m)}, where {it:m} is the observed minimum value
of {it:varname}.
{pmore}
Specify {opt start()} when you are concerned about sparse data, for
instance, if you know that {it:varname} can have a value of 0, but you
are concerned that 0 may not be observed.
{pmore}
{opt start(#)}, if specified, must be less than or equal to {it:m}, or
else an error will be issued.
{title:Options for use in the discrete case}
{phang}
{opt discrete} specifies that {it:varname} is discrete and that you want
each unique value of {it:varname} to have its own bin (bar of histogram).
{phang}
{opt width(#)} is rarely specified in the discrete case; it specifies
the width of the bins. The default is {opt width(d)}, where {it:d} is
the observed minimum difference between the unique values of
{it:varname}.
{pmore}
Specify {opt width()} if you are concerned that your data are sparse.
For example, in theory {it:varname} could take on the values, say, 1,
2, 3, ..., 9, but because of the sparseness, perhaps only the values 2,
4, 7, and 8 are observed. Here the default width calculation would
produce {cmd:width(2)} and you would want to specify {cmd:width(1)}.
{phang}
{opt start(#)} is also rarely specified in the discrete case; it
specifies the theoretical minimum value of {it:varname}. The default is
{opt start(m)}, where {it:m} is the observed minimum value.
{pmore}
As with {opt width()}, you specify {opt start(#)} if you are concerned
that your data are sparse. In the previous example, you might also
want to specify {cmd:start(1)}. In a histogram, {opt start()} does nothing
more than add white space to the left side of the graph; in the table it
only moves the lower limit of the first bin.
{pmore}
The value of {it:#} in {opt start()} must be less than or equal to
{it:m}, or an error will be issued.
{title:Common options}
{phang}
{opt density},
{opt fraction},
{opt frequency}, and
{opt percent} specify whether you want the histogram scaled to density
units, fractional units, frequencies, or percentages. {opt density}
is the default.
{pmore}
{opt density} scales the height of the bars so that the sum of their
areas equals 1.
{pmore}
{opt fraction} scales the height of the bars so that the sum of their
heights equals 1.
{pmore}
{opt frequency} scales the height of the bars so that each bar's
height is equal to the number of observations in the category. Thus
the sum of the heights is equal to the total number of observations.
{pmore}
{opt percent} scales the height of the bars so that the sum of their
heights equals 100.
{phang}
{cmd:display} indicates that a short note be displayed indicating
the number of bins, the lower limit of the first bin, and the width
of the bins. The output displayed is determined by whether the
{cmd:discrete} option was specified.
{title:Author}
{p 4 4}
Maarten L. Buis{break}
Vrije Universiteit Amsterdam{break}
Department of Social Research Methodology{break}
[email protected]
{p_end}
{title:Acknowledgement}
{phang}
Several programming tricks from Harrison (2005) are incorporated
in this program.
{title:References}
{phang}
Harrison, David A. (2005), "Stata tip 20: Generating histogram bin
variables". {it:The Stata Journal}, 5(2), pp. 280-281.
{title:Also see}
{psee}
Online:
{manhelp histogram R},
{manhelp twoway_histogram G:graph twoway histogram}
{p_end}
*------------------ end histtab.hlp ------------------------
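For those who want to check the table by hand, here is how I read the four scalings described in the help file, written out as formulas. The notation is mine, not Stata's: f_j is the (weighted) number of observations in bin j, N the total number of observations, and w the common bin width. The last line works through the default number of bins for N = 74 (the size of the auto data); I am assuming the result is then truncated to a whole number of bins.

```latex
\begin{align*}
\text{frequency}_j &= f_j,            & \sum_j \text{frequency}_j &= N,\\
\text{fraction}_j  &= f_j / N,        & \sum_j \text{fraction}_j  &= 1,\\
\text{percent}_j   &= 100\, f_j / N,  & \sum_j \text{percent}_j   &= 100,\\
\text{density}_j   &= f_j / (N w),    & \sum_j w \cdot \text{density}_j &= 1.
\end{align*}
% default number of bins when neither bin() nor width() is given, N = 74
\[
k = \min\left\{\sqrt{74},\ \frac{10 \ln 74}{\ln 10}\right\}
  = \min\{8.60,\ 18.69\} = 8.60
\]
```

So a density can be turned into a fraction by multiplying it by the bin width, and into a frequency by multiplying by N times the bin width.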
-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands
visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434
+31 20 5986715
http://home.fsw.vu.nl/m.buis/
-----------------------------------------